

NASA-CP-10112-VOL-2
19930020367 



A Service of: 



National Aeronautics and 
Space Administration 

Scientific and Technical 
Information Program Office 

Center for AeroSpace Information 



NOTICE 


THIS REPRODUCTION WAS MADE FROM THE BEST AVAILABLE 
COPY OF WHICH A NUMBER OF PAGES WERE OF POOR 
REPRODUCTION QUALITY. 






(NASA-CP-10112-Vol-2) NORTH
AMERICAN FUZZY LOGIC PROCESSING
SOCIETY (NAFIPS 1992), VOLUME 2
(NASA) 292 p


N93-29556
--THRU--
N93-29585
Unclas


G3/59 0161840



NASA Conference Publication 10112, Vol. II 

NORTH AMERICAN 
FUZZY INFORMATION 
PROCESSING SOCIETY 

(NAFIPS ’92) 


James Villarreal, Compiler 
NASA Lyndon B. Johnson Space Center 
Houston, Texas

Proceedings of a conference sponsored by the 
North American Fuzzy Information Processing Society 
in cooperation with the National Aeronautics and Space 
Administration, the Instituto Tecnologico de Morelia,
the Indian Society for Fuzzy Mathematics and Information
Processing, the Instituto Tecnologico de Estudios Superiores
de Monterrey, the International Fuzzy Systems Association,
the Japan Society for Fuzzy Theory and Systems,
and the Microelectronics and Computer Technology Corporation,
and held at the Melia Hotel, Paseo de la Marina Sur, Marina
Vallarta, Puerto Vallarta, Mexico

December 15-17, 1992 


National Aeronautics and Space Administration 
Lyndon B. Johnson Space Center 
Houston, Texas
1992 




NAFIPS '92, an international conference on fuzzy set theory and applications, is sponsored
by NAFIPS, in cooperation with:

* National Aeronautics and Space Administration (NASA)

* Instituto Tecnologico de Morelia

* Indian Society for Fuzzy Mathematics and Information Processing (ISFUMIP)

* Instituto Tecnologico de Estudios Superiores de Monterrey (ITESM)

* International Fuzzy Systems Association (IFSA)

* Japan Society for Fuzzy Theory and Systems

* Microelectronics and Computer Technology Corporation (MCC)

Fuzzy set theory has led to a large number of diverse applications. Recently, interesting 
applications have been developed which involve the integration of fuzzy systems with 
adaptive processes such as neural networks and genetic algorithms. NAFIPS '92 will be 
directed toward the advancement, commercialization, and engineering development of these 
technologies. 

The conference will consist of both plenary sessions and contributory sessions. The plenary 
sessions will be addressed by leading experts. Topics to be discussed at this conference 
include the following: 

* Biomedical and Biochemical Issues 

* Business and Decision Making 

* Commercial Products and Tools 

* Computer Systems and Information Processing 

* Control Systems

* Decision Analysis

* Foundations and Mathematical Issues

* Genetic Algorithms 

* Hardware 

* Image Processing and Vision 

* Neural Networks 

* Optimization 

* Path Planning 

* Pattern Recognition 

* Robotics 







Honorary Chair 
Lotfi Zadeh 

University of California, Berkeley 


General Chair
Robert N. Lea 

NASA - Lyndon B. Johnson Space Center 
Houston, TX 77058 


Program 

James A. Villarreal 
Software Technology Branch 
Lyndon B. Johnson Space Center 
NASA 

Houston, TX 77058 
Tel: (713)483-8076 

EMAIL: JAMES@GOTHAMCITY.JSC.NASA.GOV


Co-Chairs

Salvador Gutierrez Martinez
Instituto Tecnologico de 
Morelia, 

Av. Tecnologico 1500 
Col. Lomas de Santiaguito 
58120 Morelia 
Morelia, Michoacan, Mexico 


Program Committee



Advisory Board 

G. Klir
J. Bezdek
R. Yager
J. Keller
L. Hall
M. Zemankova
T. Yamakawa
M. Sugeno
S. K. Pal
Y. Jani
I. Turksen
M. Murphy


Committee 


J. Aldridge
C. Joslyn
B. Schott
H. Berenji
C. Karr
H. Shehadeh
J. Bezdek
J. Keller
R. O. Shelton
P. Bonissone
J. Rowing
S. Shenoi
B. Buckles
D. Kraft
M. Shipley
J. Buckley
V. Kreinovich
R. E. Smith
S. Chiu
R. Krishnapuram
S. Mitra
T. Cleghorn
R. N. Lea
T. Takagi
C. Copeland
A. Leigh
M. Valenzuela
A. deKorvin
D. Lao
J. A. Villarreal
J. Dockery
M. Murphy
L. Wang
D. Dubois
S.Ha»
P. Wang
T. Espy
C. Negoita
H. Watanabe
W. Gillespie
S. K. Pal
W. Wee
M. Groff
A. Parlos
J. Weiss
S. Gutierrez-Martinez
F. Petry
I. A. Willson
L. Hall
N. Pfluger
T. Whalen
T. Hanson
F. Pin
R. Yager
J. Hoblitt
P. A. Ramamoorthy
J. Yao
D. Hudson
A. Ramer
J. Yen
Y. Jani
E. Ruspini
H. Ying
B. Jansen
E. Sanchez
M. Zemankova



TUTORIALS 


Tutorials by leading experts will be provided on December 14, 1992. 


8:00 - 9:40 

Introduction to Fuzzy Sets and Approximate Reasoning 
RONALD R. YAGER, Iona College, New Rochelle, NY, USA


9:50 - 11:30

Fuzzy Intelligent Information Systems

M. ZEMANKOVA, National Science Foundation, Washington, DC, USA


11:30 - 12:30

Lunch


12:30 - 2:10

Fuzzy Logic in Expert Systems and Its Applications for IE/OR/MS
I.B. TURKSEN, University of Toronto, Toronto, ON, CANADA


2:20 - 4:00

Fuzzy Control and Its Applications

M. SUGENO, Tokyo Institute of Technology, Yokohama, JAPAN


4:10 - 5:50

Fuzzy Hardware Design and Its Applications
K. HIROTA, Hosei University, Tokyo, JAPAN


CONFERENCE

Tuesday, December 15, 1992


8:00 

Welcoming Remarks 


8:15-9:00 

Plenary Speech 

Professor Lotfi Zadeh, University of California at Berkeley


9:00 - 12:00

Parallel Sessions



An Analysis of Possible Applications of Fuzzy Set Theory to the Credibility Theory

KRZYSZTOF OSTASZEWSKI, University of Louisville, Louisville, KY
WALDEMAR KARWOWSKI, University of Louisville, Louisville, KY

Estimations of Expectedness and Potential Surprise in Possibility Theory
HENRI PRADE, Universite Paul Sabatier, Toulouse Cedex, FRANCE
RONALD R. YAGER, Iona College, New Rochelle, NY

Comparison of Specificity and Information for Fuzzy Domains
ARTHUR RAMER, University of New South Wales, Kensington, AUSTRALIA

The Axiomatic Definition of a Linguistic Scale Fuzziness Degree, Its Major Properties and
Applications

ALEXANDER P. RYJOV, Soviet Association of Fuzzy Systems, Moscow, RUSSIA

How to Select Combination Operators for Fuzzy Expert Systems Using CRI
I.B. TURKSEN, University of Toronto, Toronto, Ontario, CANADA
Y. TIAN, University of Toronto, Toronto, Ontario, CANADA

Approximate Reasoning Using Terminological Models
JOHN YEN, Texas A&M University, College Station, TX
NITIN VAIDYA, Texas A&M University, College Station, TX









Quantitative Analysis of Properties and Spatial Relations of Fuzzy Image Regions
RAGHU KRISHNAPURAM, University of Missouri, Columbia, MO
JAMES M. KELLER, University of Missouri, Columbia, MO
YIBING MA, University of Missouri, Columbia, MO

A Fuzzy Clustering Algorithm to Detect Planar and Quadric Shapes
RAGHU KRISHNAPURAM, University of Missouri, Columbia, MO
HICHEM FRIGUI, University of Missouri, Columbia, MO
OLFA NASRAOUI, University of Missouri, Columbia, MO

A Fuzzy Measure Approach to Motion Frame Analysis for Scene Detection
ALBERT B. LEIGH, McDonnell Douglas Space Systems, Houston, TX
SANKAR K. PAL, Indian Statistical Institute, Calcutta, INDIA

Automatic Rule Generation for High-Level Vision

FRANK CHUNG-HOON RHEE, University of Missouri, Columbia, MO
RAGHU KRISHNAPURAM, University of Missouri, Columbia, MO

Encoding Spatial Images - A Fuzzy Set Theory Approach
LESZEK M. SZTANDERA, University of Toledo, Toledo, OH

Image Segmentation Using LVQ Clustering Networks

ERIC CHEN-KUO TSAO, The University of West Florida, Pensacola, FL
JAMES C. BEZDEK, The University of West Florida, Pensacola, FL
NIKHIL R. PAL, The University of West Florida, Pensacola, FL

12:00 - 1:00 Lunch 

1:00 - 3:30 Parallel Sessions 



A Neuro-Fuzzy Architecture for Real-Time Applications 
P. A. RAMAMOORTHY, University of Cincinnati, Cincinnati, OH
SONG HUANG, University of Cincinnati, Cincinnati, OH

A Composite Self Tuning Strategy for Fuzzy Control of Dynamic Systems
C-Y SHIEH, University of Missouri, Columbia, MO
SATISH S. NAIR, University of Missouri, Columbia, MO

A Self-Learning Rule Base for Command Following in Dynamical Systems
WEI K. TSAI, University of California at Irvine, Irvine, CA
HON-MUN LEE, University of California at Irvine, Irvine, CA
ALEXANDER PARLOS, Texas A&M University, College Station, TX

Adaptive Defuzzification for Fuzzy Systems Modeling
RONALD R. YAGER, Iona College, New Rochelle, NY
DIMITAR P. FILEV, Iona College, New Rochelle, NY

Design Issues of a Reinforcement-Based Self-Learning Fuzzy Controller for Petrochemical
Process Control

JOHN YEN, Texas A&M University, College Station, TX
HAOJIN WANG, Texas A&M University, College Station, TX
WALTER C. DAUGHERITY, Texas A&M University, College Station, TX














Learning Characteristics of a Space-Time Neural Network as a Tether Skiprope Observer
ROBERT N. LEA, NASA/Johnson Space Center, Houston, TX
JAMES A. VILLARREAL, NASA/Johnson Space Center, Houston, TX
YASHVANT JANI, Togai InfraLogic Inc., Houston, TX
CHARLES COPELAND, Loral Space Systems, Houston, TX

Clustering of Tethered Satellite System Simulation Data by an Adaptive Neuro-Fuzzy
Algorithm

SUNANDA MITRA, Texas Tech University, Lubbock, TX
SURYA PEMMARAJU, Texas Tech University, Lubbock, TX

Character Recognition Using a Neural Network Model with Fuzzy Representation
NASSRIN TAVAKOLI, University of North Carolina at Charlotte, Charlotte, NC
DAVID SENIW, University of North Carolina at Charlotte, Charlotte, NC

Designing a Fuzzy Scheduler for Hard Real-Time Systems
JOHN YEN, Texas A&M University, College Station, TX
JONATHAN LEE, Texas A&M University, College Station, TX
NATHAN PFLUGER, Texas A&M University, College Station, TX
SWAMI NATARAJAN, Texas A&M University, College Station, TX

WARP: Weight Associative Rule Processor, A Dedicated VLSI Fuzzy Logic Megacell
ANDREA PAGNI, SGS-Thomson Microelectronics, Agrate Brianza (MI), ITALY
R. POLUZZI, SGS-Thomson Microelectronics, Agrate Brianza (MI), ITALY
G. G. RIZZOTTO, SGS-Thomson Microelectronics, Agrate Brianza (MI), ITALY


Wednesday, December 16, 1992


8:00 - 8:45 Plenary Speech

Piero Bonissone, "Fuzzy Logic Control: From Development to
Deployment (with an Application to Aircraft Engine Control)" 


8:45 - 10:45 Parallel Sessions 



Evaluation of Fuzzy Inference Systems Using Fuzzy Least Squares 
JOSEPH M. BARONE, Loki Software, Inc., Liberty Corner, NJ

A Model for Amalgamation in Group Decision Making

VINCENZO CUTELLO, Consorzio per la Ricerca sulla Microelettronica nel Mezzogiorno, Catania, ITALY
JAVIER MONTERO, Complutense University, Madrid, SPAIN

Fuzzy Forecasting and Decision Making in Short Dynamic Time Series
EFIM JA. KARPOVSKY, Odessa Institute of National Economy, Odessa, UKRAINE

Decision Analysis With Approximate Probabilities
THOMAS WHALEN, Georgia State University, Atlanta, GA









Distributed Traffic Signal Control Using Fuzzy Logic
STEPHEN CHIU, Rockwell International Science Center,
Thousand Oaks, CA

Intelligent Virtual Reality in the Setting of Fuzzy Sets
JOHN T. DOCKERY, George Mason University, Fairfax, VA
DAVID LITTMAN, George Mason University, Fairfax, VA


Comparison of Crisp and Fuzzy Character Networks in Handwritten Word Recognition
PAUL GADER, University of Missouri, Columbia, MO
MAGDI MOHAMED, University of Missouri, Columbia, MO
JUNG-HSIEN CHIANG, University of Missouri, Columbia, MO


Fuzzy Neural Network Methodology Applied to Medical Diagnosis
MARIAN B. GORZALCZANY, Technical University of Kielce, Kielce, POLAND
MARY DEUTSCH-MCLEISH, University of Guelph, Guelph, Ontario, CANADA


11:00 - 12:00 Parallel Sessions 



An Experimental Methodology for a Fuzzy Set Preference Model 
I.B. TURKSEN, University of Toronto, Toronto, ON, CANADA
IAN A. WILLSON, University of Toronto, Toronto, ON, CANADA


A Fuzzy Set Preference Model for Market Share Analysis
I.B. TURKSEN, University of Toronto, Toronto, ON, CANADA
IAN A. WILLSON, University of Toronto, Toronto, ON, CANADA



Information Compression in the Context Model

JORG GEBHARDT, Technical University of Braunschweig, Braunschweig, GERMANY
RUDOLF KRUSE, Technical University of Braunschweig, Braunschweig, GERMANY
DETLEF NAUCK, Technical University of Braunschweig, Braunschweig, GERMANY


Fuzzy Knowledge Base Construction Through Belief Networks Based on Lukasiewicz Logic
FELIPE LARA-ROSANO, Universidad Nacional Autonoma de Mexico, Mexico DF, MEXICO


12:00 - 1:00 Lunch

1:00 - 3:30 Parallel Sessions 



Intelligent Fuzzy Controller for Event-Driven Real Time Systems 
JANOS GRANTNER, University of Minnesota, Minneapolis, MN
MAREK PATYRA, University of Minnesota, Minneapolis, MN
MARIAN S. STACHOWICZ, University of Minnesota, Minneapolis, MN


Fuzzy Coordinator in Control Problems

A. RUEDA, University of Manitoba, Winnipeg, Manitoba, CANADA
W. PEDRYCZ, University of Manitoba, Winnipeg, Manitoba, CANADA








Tuning a Fuzzy Controller Using Quadratic Response Surfaces 
BRIAN SCHOTT, Georgia State University, Atlanta, GA
THOMAS WHALEN, Georgia State University, Atlanta, GA

The Cognitive Bases for the Design of a New Class of Fuzzy Logic Controllers: The Clearness
Transformation Fuzzy Logic Controller

LABIB SULTAN, York University, Toronto, Ontario, CANADA
TALIB JANABI, Mentalogic Systems Inc., Markham, Ontario, CANADA

A Fuzzy Control Design Case: The Fuzzy PLL

H. N. TEODORESCU, Polytechnic Institute of Iasi, ROMANIA
I. BOGDAN, Polytechnic Institute of Iasi, ROMANIA


Adding Dynamic Rules to Self-Organizing Fuzzy Systems
CATALIN V. BUHUSI, Romanian Academy, Calea Copou, Iasi, ROMANIA

Fuzzy Learning Under and About an Unfamiliar Fuzzy Teacher
BELUR V. DASARATHY, Dynetics, Huntsville, AL

Some Problems with the Design of Self-Learning Management Systems
ZiNY FUKOP, NYNEX Science and Technology, Inc., White Plains, NY

A Neural Fuzzy Controller Learning by Fuzzy Error Propagation

DETLEF NAUCK, Technical University of Braunschweig, Braunschweig, GERMANY
RUDOLF KRUSE, Technical University of Braunschweig, Braunschweig, GERMANY


Thursday, December 17, 1992
8:00 - 10:00 Parallel Sessions



Determining Rules for Closing Customer Service Centers: A Public Utility Company's Fuzzy
Decision

ANDRE DEKORVIN, University of Houston - Downtown, Houston, TX
MARGARET F. SHIPLEY, University of Houston - Downtown, Houston, TX
ROBERT N. LEA, NASA/Johnson Space Center, Houston, TX


Fuzzy Simulation in Concurrent Engineering

A. KRASLAWSKI, Lappeenranta University of Technology, Lappeenranta, FINLAND
L. NYSTROM, Lappeenranta University of Technology, Lappeenranta, FINLAND


Inverse Problems: Fuzzy Representation of Uncertainty Generates a Regularization
V. KREINOVICH, University of Texas at El Paso, El Paso, TX
CHING-CHUANG CHANG, University of Texas at El Paso, El Paso, TX
L. REZNIK, Victoria University of Technology, MMC Melbourne, VIC 3000, AUSTRALIA
G. N. SOLOPCHENKO, St. Petersburg Technical University, St. Petersburg, RUSSIA




Quantification of Human Responses

RALPH C. STEINLAGE, University of Dayton, Dayton, OH
T. E. GANTNER, University of Dayton, Dayton, OH
P. Y. W. LIM, Boise Cascade R&D, Portland, OR



Non-Scalar Uncertainty

SALVADOR GUTIERREZ-MARTINEZ, Instituto Tecnologico de Morelia, Morelia, MEXICO

Comparison Between the Performance of Two Classes of Fuzzy Controllers
TALIB H. JANABI, Mentalogic Systems Inc., Markham, Ontario, CANADA
L.H. SULTAN, York University, Toronto, Ontario, CANADA

Possibilistic Measurement and Set Statistics
CLIFF JOSLYN, SUNY-Binghamton, Portland, ME

The Fusion of Information via Fuzzy Integration
JIM KELLER, University of Missouri, Columbia, MO
HOSSEIN TAHANI, University of Missouri, Columbia, MO


10:15 - 11:45 Parallel Sessions 



On the Evaluation of Fuzzy Quantified Queries In a Database Management System 
PATRICK BOSC, IRISA/ENSSAT, Lannion, Cedex, FRANCE 
OLIVIER PIVERT, IRISA/ENSSAT, Lannion, Cedex, FRANCE 


A Fuzzy Case Based Reasoning Tool for Model Based Approach to Rocket Engine Health 
Monitoring 

SRINIVAS KROWIDY, University of Cincinnati, Cincinnati. OH 
ADAM NOLAN, University of Cincinnati, Cincinnati, OH 
YONG LIN HU, University of Cincinnati, Cincinnati, OH
WILLIAM G. WEE, University of Cincinnati, Cincinnati, OH 

A High Performance, Ad-Hoc Fuzzy Query Processing System for Relational Databases 
W.H. MANSFIELD, Bellcore, Cambridge, MA, USA
ROBERT M. FLEISCHMAN, BBN, Cambridge, MA. USA 



Genetic Algorithms In Adaptive Fuzzy Control 

C. LUCAS KARR, U. S. Department of Interior Bureau of Mines, Tuscaloosa, AL 

A Genetic Algorithms Approach for Altering the Membership Functions In Fuzzy Logic 
Controllers 

HANA SHEHADEH, LinCom Corporation, Houston, TX
ROBERT N. LEA, NASA/Johnson Space Center, Houston, TX 

Fuzzy Multiple Linear Regression - A Computational Approach
C.H. JUANG, Clemson University, Clemson, SC
X.H. HUANG, Clemson University, Clemson, SC
J.W. FLEMING, Clemson University, Clemson, SC






12:00 - 1:00 Lunch

1:00 - 4:30 Parallel Sessions


Incorporation of Varying Types of Temporal Data in a Neural Network

M. E. COHEN, California State University, Fresno, CA
D. L. HUDSON, California State University, Fresno, CA

Fuzzy Operators and Cyclic Behaviour in Formal Neural Networks
E. LABOS, Semmelweis University Medical School, Budapest, HUNGARY
A. V. HOLDEN, The University of Leeds, Leeds, UK
J. LACZKO, Ludwig Maximilian University, Munich, GERMANY
A. S. LABOS, Semmelweis University Medical School, Budapest, HUNGARY

Neural Networks: A Simulation Technique Under Uncertainty Conditions
LUISA MCALLISTER, Moravian College, Bethlehem, PA

Incomplete Fuzzy Data Processing Using Artificial Neural Network
MAREK J. PATYRA, University of Minnesota, Duluth, MN

«« ££ *£*«* 

EUGENE ROVENTA, York University, Toronto, Ontario, CANADA

A Conjugate Gradients/Trust Regions Algorithm for Training Multilayer Perceptrons for
Nonlinear Mapping
RAGHAVENDRA K. MADYASTHA, Rice University, Houston, TX
BEHNAAM AAZHANG, Rice University, Houston, TX
TROY F. HENSON, IBM Corporation, Houston, TX
WENDY L. HUXHOLD, IBM Corporation, Houston, TX








On Probability-Possibility Transformations 

GEORGE KLIR, State University of New York, Binghamton, NY
BEHZAD PARVIZ, California State University, Los Angeles, CA

Inference in Fuzzy Rule Bases with Conflicting Evidence

LASZLO T. KOCZY, Technical University of Budapest, Budapest, HUNGARY

Gaussian Membership Functions are Most Adequate in Representing Uncertainty in
Measurements

V. KREINOVICH, University of Texas at El Paso, El Paso, TX
C. QUINTANA, University of Michigan at Ann Arbor, Ann Arbor, MI
L. REZNIK, Victoria University of Technology, MMC Melbourne, VIC 3000, AUSTRALIA

Applying the Metric Truth Approach to Fuzzified Automated Reasoning
VESA A. NISKANEN, University of Helsinki, Helsinki, FINLAND

Life Insurance Risk Assessment Using a Fuzzy Logic Expert System
L. A. CARRENO, Togai InfraLogic, Houston, TX
R. A. STEEL, Togai InfraLogic, Houston, TX




CONTENTS 


An Analysis of Possible Applications of Fuzzy Set Theory to the Actuarial Credibility Theory

Estimations of Expectedness and Potential Surprise in Possibility Theory

Comparison of Specificity and Information for Fuzzy Domains

The Axiomatic Definition of a Linguistic Scale Fuzziness Degree, Its Major Properties and Applications

How to Select Combination Operators for Fuzzy Expert Systems Using CRI

Approximate Reasoning Using Terminological Models

Quantitative Analysis of Properties and Spatial Relations of Fuzzy Image Regions

A Fuzzy Clustering Algorithm to Detect Planar and Quadric Shapes

A Fuzzy Measure Approach to Motion Frame Analysis for Scene Detection

Automatic Rule Generation for High-Level Vision

Encoding Spatial Images - A Fuzzy Set Theory Approach

Image Segmentation Using LVQ Clustering Networks

A Neuro-Fuzzy Architecture for Real-Time Applications

A Composite Self Tuning Strategy for Fuzzy Control of Dynamic Systems

A Self-Learning Rule Base for Command Following in Dynamical Systems

Adaptive Defuzzification for Fuzzy Systems Modeling

Design Issues of a Reinforcement-Based Self-Learning Fuzzy Controller for Petrochemical Process Control

Learning Characteristics of a Space-Time Neural Network as a Tether "Skiprope Observer"

Clustering of Tethered Satellite System Simulation Data by an Adaptive Neuro-Fuzzy Algorithm

Character Recognition Using a Neural Network Model with Fuzzy Representation

Designing a Fuzzy Scheduler for Hard Real-Time Systems

WARP: Weight Associative Rule Processor, A Dedicated VLSI Fuzzy Logic Megacell

Evaluation of Fuzzy Inference Systems Using Fuzzy Least Squares

A Model for Amalgamation in Group Decision Making

Fuzzy Forecasting and Decision Making in Short Dynamic Time Series

Decision Analysis With Approximate Probabilities

Distributed Traffic Signal Control Using Fuzzy Logic

Intelligent Virtual Reality in the Setting of Fuzzy Sets

Comparison of Crisp and Fuzzy Character Networks in Handwritten Word Recognition

Fuzzy Neural Network Methodology Applied to Medical Diagnosis

An Experimental Methodology for a Fuzzy Set Preference Model

A Fuzzy Set Preference Model for Market Share Analysis

Information Compression in the Context Model

Fuzzy Knowledge Base Construction Through Belief Networks Based on Lukasiewicz Logic

Intelligent Fuzzy Controller for Event-Driven Real Time Systems

Fuzzy Coordinator in Control Problems

Tuning a Fuzzy Controller Using Quadratic Response Surfaces

The Cognitive Bases for the Design of a New Class of Fuzzy Logic Controllers: The Clearness Transformation
Fuzzy Logic Controller

A Fuzzy Control Design Case: The Fuzzy PLL

Adding Dynamic Rules to Self-Organizing Fuzzy Systems

Fuzzy Learning Under and About an Unfamiliar Fuzzy Teacher

Some Problems With the Design of Self-Learning Management Systems

A Neural Fuzzy Controller Learning by Fuzzy Error Propagation

Determining Rules for Closing Customer Service Centers: A Public Utility Company's Fuzzy Decision

Fuzzy Simulation in Concurrent Engineering

Inverse Problems: Fuzzy Representation of Uncertainty Generates a Regularization

Quantification of Human Responses

Non-Scalar Uncertainty

Comparison Between the Performance of Two Classes of Fuzzy Controllers

Possibilistic Measurement and Set Statistics

The Fusion of Information Via Fuzzy Integration

On the Evaluation of Fuzzy Quantified Queries in a Database Management System

A Fuzzy Case Based Reasoning Tool for Model Based Approach to Rocket Engine Health Monitoring

A High Performance, Ad-Hoc, Fuzzy Query Processing System for Relational Databases

Genetic Algorithms in Adaptive Fuzzy Control

A Genetic Algorithms Approach for Altering the Membership Functions in Fuzzy Logic Controllers

Fuzzy Multiple Linear Regression - A Computational Approach

Incorporation of Varying Types of Temporal Data in a Neural Network

Fuzzy Operators and Cyclic Behavior in Formal Neuronal Networks

Neural Networks: A Simulation Technique Under Uncertainty Conditions

Incomplete Fuzzy Data Processing System Using Artificial Neural Network

Stochastic Architecture for Hopfield Neural Nets

Hierarchical Model of Matching

A Conjugate Gradients/Trust Regions Algorithm for Training Multilayer Perceptrons for Nonlinear Mapping

Inference in Fuzzy Rule Bases with Conflicting Evidence

Gaussian Membership Functions Are Most Adequate in Representing Uncertainty in Measurements

Applying the Metric Truth Approach to Fuzzified Automated Reasoning

Life Insurance Risk Assessment Using a Fuzzy Logic Expert System

Author Index




N93-29557 


Adding Dynamic Rules to Self-Organizing Fuzzy Systems 


Catalin V. Buhusi 

Romanian Academy, Institute for Computer Science, 
Calea Copou nr. 22 A, IASI 6600, ROMANIA 


Abstract 

This paper develops a Dynamic Self-Organizing Fuzzy System (DSOFS) capable of adding,
removing and/or adapting the fuzzy rules and the fuzzy reference sets. The DSOFS is built
on a self-organizing neural structure with neuron relocation features, which develops
a map of the input-output behaviour. The relocation algorithm extends the topological ordering
concept. Fuzzy rules (neurons) are dynamically added or released while the neural structure
learns the pattern. The advantages of the DSOFS are its automatic synthesis and the possibility of
parallel implementation. We observe a high adaptation speed and a reduced number of
neurons needed to keep errors within given limits. Computer simulation results for
a nonlinear systems modelling application are shown.

Keywords: fuzzy systems, neural networks, neuron relocation, Kohonen self-organizing
procedure, LMS procedure, feature map, basin of attraction, lateral feed-back.


I. Introduction 

The promising link between fuzzy reasoning and massively parallel computation,
i.e. fuzzy neural networks, has become an important topic of fuzzy systems
research in recent years.

Learning the membership functions and the fuzzy rules is a major problem in
synthesizing a fuzzy system. In this paper we are interested in the automatic
synthesis of reference sets and fuzzy rules. One class of fuzzy systems that
offers a solution to these problems is based on self-organizing neural
structures which map the desired topological relations between the fuzzy
system input and output. Some solutions are briefly discussed in Section III.

This paper presents a Dynamic Self-Organizing Fuzzy System (DSOFS) capable
of adding, adapting and/or removing the fuzzy rules and the reference fuzzy
sets. The fuzzy system synthesis is based on a modifiable adaptive neural
network using a neuron relocation algorithm as a learning method. This
algorithm extends the topological ordering concept [4,7]. In the adaptation
process neurons are added and/or removed while the network learns the pattern.
This is the neural equivalent of modifying the fuzzy system rules. The
relocation algorithm assigns to every fuzzy rule (neuron) a basin of
attraction, which serves as the basis for constructing the fuzzy reference sets.

II. The Dynamic Self-Organizing
Fuzzy System Definition

To fix ideas, we denote by $R^n$ the input universe of discourse and by
$R^m$ the output universe of discourse, where $n$ and $m$ are fixed integers.

The DSOFS input and output are vectors in $R \times ... \times R$. We denote
such vectors as $X$, $Y$, or $\{x_1, x_2, ..., x_n\}$, $\{y_1, y_2, ..., y_m\}$.

The rules of the DSOFS have the following form:

$$\text{if } X \text{ is } X^r \text{ then } Y \text{ is } Y^r \text{ with } b^r \qquad (1)$$

where $X \in R^n$ is the input vector; $Y \in R^m$ is the output vector;
$X^r \in R^n$ is the input reference vector for rule number $r$;
$Y^r \in R^m$ is the output reference vector for rule number $r$; $b^r$ is
the basin of attraction of rule number $r$, $b^r \in R_+$.

The truth degree $w^r$ of any rule $r$ is given by:

$$w^r = F[b^r]\big(d(X, X^r)\big) \qquad (2)$$

where $b^r$ is the basin of attraction of rule number $r$; $d(\cdot,\cdot)$ is
the Euclidean distance; $F[b^r](\cdot)$ is a family of functions of
parameter $b^r$ such that:

(i) $F[b^r] : R_+ \to R$, $\forall b^r \in R_+$;

(ii) $F[b^r]$ is monotone decreasing, $\forall b^r \in R_+$;

(iii) $F[b^r](0) = 1$, $\forall b^r \in R_+$;

(iv) if $b_i > b_j$ then $F[b_i](z) > F[b_j](z)$, $\forall z \in R_+$
and $b_i, b_j \in R_+$.
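As an illustration only, since the text does not fix a particular family, one candidate that meets conditions (i) to (iv) is the exponential family below. The choice of the exponential is an assumption made here for the sake of a worked check, not the family used by the author.

```latex
% Assumed example family (not specified in the source):
F[b](z) = \exp\!\left(-\frac{z}{b}\right), \qquad b > 0,\ z \ge 0 .

% (i)   F[b] maps R_+ into (0, 1] \subset R.
% (ii)  \frac{d}{dz} F[b](z) = -\frac{1}{b}\exp\!\left(-\frac{z}{b}\right) < 0,
%       so F[b] is monotone decreasing.
% (iii) F[b](0) = \exp(0) = 1.
% (iv)  For b_i > b_j and z > 0:  -\frac{z}{b_i} > -\frac{z}{b_j}
%       \;\Longrightarrow\; F[b_i](z) > F[b_j](z).
%       At z = 0 both values equal 1, so the inequality is strict only for z > 0.
```

With this family a larger basin of attraction makes the truth degree fall off more slowly with distance, which is the behaviour condition (iv) asks for.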

The fuzzy system output $Y$ is computed via:

$$y_i = \sum_{r=1}^{N} w^r y^r_i \Big/ \sum_{r=1}^{N} w^r , \qquad i = 1..m \qquad (3)$$

where $N$ is the number of rules; $w^r$, $y^r$ have
their previous meanings.
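The inference defined by equations (2) and (3) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the family F[b](z) = exp(-z/b) is an illustrative choice (the text only constrains F through conditions (i) to (iv)), and the names `truth_degree` and `dsofs_output` and the tuple layout of a rule are hypothetical.

```python
import math

def truth_degree(x, x_ref, basin):
    """Eq. (2): w^r = F[b^r](d(X, X^r)).

    ASSUMPTION: F[b](z) = exp(-z / b); the source does not fix F.
    """
    z = math.dist(x, x_ref)          # Euclidean distance d(X, X^r)
    return math.exp(-z / basin)

def dsofs_output(x, rules):
    """Eq. (3): truth-degree-weighted average of the output reference vectors.

    `rules` is a list of (x_ref, y_ref, basin) tuples (hypothetical layout).
    """
    weights = [truth_degree(x, x_ref, basin) for x_ref, _, basin in rules]
    total = sum(weights)
    m = len(rules[0][1])             # dimension of the output vector
    return [
        sum(w * y_ref[i] for w, (_, y_ref, _) in zip(weights, rules)) / total
        for i in range(m)
    ]

# Two rules with scalar outputs 1.0 and 3.0 and equal basins.
rules = [((0.0, 0.0), (1.0,), 1.0), ((2.0, 0.0), (3.0,), 1.0)]
midpoint_output = dsofs_output((1.0, 0.0), rules)
near_first_output = dsofs_output((0.0, 0.0), rules)
```

An input equidistant from both reference vectors gives equal truth degrees, so the output is the plain average of the two output reference values; an input sitting on the first reference vector is pulled towards that rule's output.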

III. The Dynamic Self-Organizing
Fuzzy System Synthesis

A fuzzy system synthesis has to solve two problems: construction of the
fuzzifier, i.e. obtaining the reference fuzzy sets of the system for both
input and output, and construction of the fuzzy rules. These problems find
particular solutions when fuzzy reasoning is linked to neural networks, and
especially to self-organizing neural networks.

Yamaguchi et al. [13] proposed unsupervised learning of the membership
functions using the Learning Vector Quantization procedure [8], and of the
if-then rule part using Bidirectional Associative Memories (BAMs) to show the
relationships interpreted from fuzzy rules. Another approach, by Takagi and
Hayashi [11], uses two kinds of neural networks, one for the membership
functions and one for the fuzzy system output, whose adaptation and
optimization are carried out by clustering algorithms. Bezdek recently
proposed [1] a fuzzy Kohonen self-organizing system, an approach linking the
Kohonen self-organizing procedure and fuzzy systems. Such a link was also
proposed in [2].

One of the drawbacks of the Kohonen self-organizing procedure, and of other
clustering algorithms as well, is that the first engineering decision to be
made is how many nodes should be used.

The Dynamic Self-Organizing Fuzzy Systems address all these problems, based on
a self-organizing neural network with neuron relocation features. During the
"learning" stage, the fuzzy rules are changed by adapting both the input and
output reference vectors and their basins of attraction. If necessary, new
rules are added and/or old ones removed. The output of the fuzzy system is
then refined through an adaptation algorithm. This adaptation is made such
that the energy of the difference between the DSOFS output and a desired
sequence of outputs is minimized. The adaptation algorithm used is the
well-known Least Mean Square (LMS) algorithm for adaptive linear combiners
[12].

III.1. The Neuron Relocation
Self-Organizing Procedure

A neural network implements the behaviour rules in the net weights. The
adaptation algorithm proposed by Kohonen in [4,7] is based on the lateral
feed-back concept. Networks using this biologically motivated process behave
such that the network outputs form clusters around the local maxima of the
excitation input. Such a neural structure assumes a constant number of
neurons, independent of the information conveyed by the pattern [4], in the
sense of the topological distribution. Thus, a network with a given number of
neurons can hold less information for a nonuniform input distribution than
for a uniform one.

Fig. 1 The DSOFS and the Neural Relocation Network (block diagram; labels:
input X, input reference fuzzy sets $X^r$, Neural Relocation Self-Organizing
Network, output reference fuzzy sets $Y^r$, DSOFS output Y)


Fig. 2 The Self-Organizing Neuron Relocation Procedure (Block Scheme).
Diagram labels: Self-Organizing Neuron Relocation Procedure for N clusters;
the neural net has N neurons corresponding to the fuzzy system rules, and n+m
inputs corresponding to the fuzzy reference vectors; inputs x1 x2 ... xn
(input vector) and y1 y2 ... ym (output vector).

In contrast, a dynamical neural structure [3] is characterized by the
addition and release of neurons in response to novelty in the pattern. The
DSOFS input-output mapping is obtained via a self-organizing neuron
relocation procedure which adapts the number of neurons (fuzzy rules), the
weights and the neurons' basins of attraction, forming clusters around the
best matching neuron (fuzzy rule). We further propose a clustering algorithm
which increases the adaptation speed and adapts the required number of
fuzzy rules.

We work with a neural net containing
a variable number of neurons, equal to the
number of fuzzy rules of the DSOFS, i.e.
N. The behaviour of the DSOFS consists of
the pairs $\{X^r, Y^r\}$ and their basins of
attraction $b^r$. In the adaptation stage the
input-output pair $\{X, Y\}$ feeds the neural
network, which maps the input-output
behaviour of the DSOFS into the net weights.
These weights are the reference vectors of
the fuzzy system $\{X^r, Y^r\}$.

In the following we use the Euclidean
distance as a measure of similarity.

The relocation algorithm is based on a
dynamical allocation of neurons according
to the specific input distribution.
Therefore, we propose the insertion of a
new rule, i.e. of a new neuron, whenever
the input is outside the basin of
attraction of every rule in the current
set of rules. A rule is removed if it is
inside the basin of attraction of another
rule. If neither condition holds, the rule
adaptation process continues in order to
build the reference fuzzy vectors, i.e.
the neural feature map.







We denote by N the variable number
of neurons. Each neuron r is represented
by the radius $b^r$ of its basin of
attraction, which varies between two fixed
limits $B_{min}$ and $B_{max}$, and by n+m
connection weights $h_{ir}$ between the
input i, i=1..n+m, and the r-th neuron. In
fact, every ordered set
$\{h_{1r}, ..., h_{n+m,r}\}$ may be regarded as
a kind of image to be matched against the
input vector $\{x_1, ..., x_n, y_1, ..., y_m\}$.

a. Initialize Structure 

Initialize N with $N_0$, $b^r$ with $B_{min}$,
and $h_{ir}$ with small random values.

b. Develop Background 

Train the network to start the feature
map formation through the successive
presentation of some (n+m)-dimensional
samples from the pattern, stopping the
process before the convergence phase [4].
This step gives the network a background
for future adaptation and avoids an
insertion explosion (see section IV).

c. Present New Sample 

Present the input vector $Z = \{X, Y\}$ and
compute the Euclidean distance to the N
neurons:

\[ d(j) = \sqrt{\sum_{i=1}^{n+m} (z_i - h_{ij})^2}, \quad j = 1..N \tag{4} \]

d. Select Best Matching Neuron 
Select the neuron k such that: 

d(k) = min {d(j)}, j = 1..N (5) 

e. Insert New Neuron 

If $d(k) > B_{max}$ then insert the
p = (N+1)-th neuron such that:

\[ h_{ip} = z_i, \quad i = 1..n+m \tag{6} \]

Update the number of neurons N, and repeat
by going to step c.

f. Adapt Network 

If $b^k < d(k) \le B_{max}$ then adapt the
network in order to yield the
characteristic feature map (7). Then
repeat by going to step c.

The weights assume new values in the
process formally specified by:

\[ \frac{dh_{ij}}{dt} = \begin{cases} f(t)\,(z_i - h_{ij}), & i = 1..n+m,\ j \in Nb(k,t) \\ 0, & \text{otherwise} \end{cases} \tag{7} \]

where k denotes the neuron with the best
matching between the input $\{X, Y\}$ and the
weights; Nb(k,t) denotes a time-decreasing
neighborhood of the k-th neuron; f(t)
denotes a slowly decreasing function of
time, determined by experience.

g. Release Neuron 

If $d(k) < b^k$ then increment the basin of
attraction of the best matching neuron:

\[ b^k = b^k + \varepsilon, \quad \varepsilon > 0 \tag{8} \]

Verify whether some neuron p is inside the
basin of attraction of the k-th neuron,
and if

\[ \sqrt{\sum_{i=1}^{n+m} (z^k_i - z^p_i)^2} < b^k, \quad 1 \le p \le N,\ p \ne k \tag{9} \]

(where $z^k$ and $z^p$ denote the reference
vectors of neurons k and p) remove neuron
p and update the number of neurons. Repeat
by going to step c.


This algorithm provides the fuzzy
rules, i.e. the neurons with their basins
of attraction, and the fuzzy reference
vectors, i.e. the pairs $\{X^r, Y^r\}$
consisting of the weights $h_{ir}$ of the
network.
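
Steps c to g above can be sketched for a single sample presentation as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `relocation_step`, the parameters `eps` and `lr`, the reduction of the neighborhood Nb(k,t) to the winning neuron alone, a constant gain in place of f(t), the initial radius given to an inserted neuron, and the treatment of the boundary case d(k) = b^k are all hypothetical choices that the text does not fix.

```python
import numpy as np

def relocation_step(H, b, z, b_min, b_max, eps=0.05, lr=0.1):
    """One presentation of sample z (steps c-g).

    H : (N, n+m) array, row r holds the weights h_ir of neuron r
    b : (N,) array of basin radii b^r
    Returns the updated (H, b).
    """
    H = np.asarray(H, dtype=float)
    b = np.asarray(b, dtype=float)
    z = np.asarray(z, dtype=float)

    d = np.linalg.norm(H - z, axis=1)   # eq. (4): Euclidean distances
    k = int(np.argmin(d))               # eq. (5): best matching neuron

    if d[k] > b_max:
        # Step e: insert a neuron that copies the sample, eq. (6).
        # Assumption: the new neuron starts with radius b_min.
        return np.vstack([H, z]), np.append(b, b_min)

    if d[k] > b[k]:
        # Step f: adapt, eq. (7), discretised with a constant gain.
        # Assumption: the neighbourhood contains only the winner.
        H = H.copy()
        H[k] += lr * (z - H[k])
        return H, b

    # Step g (d[k] <= b[k]; equality is assigned here by assumption):
    # grow the winner's basin, eq. (8), capped at b_max.
    b = b.copy()
    b[k] = min(b[k] + eps, b_max)
    # Release every other neuron lying inside the winner's basin, eq. (9).
    keep = np.linalg.norm(H - H[k], axis=1) >= b[k]
    keep[k] = True
    return H[keep], b[keep]
```

Calling the function once per training sample, after a short conventional training phase (step b), reproduces the insert, adapt and release cycle described above.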

III.2. The DSOFS Refined
Synthesis via LMS Adaptation
Procedure

The synthesis of the fuzzy system via the
self-organizing neuron relocation gives
only a mean estimation of the pairs
$\{X^r, Y^r\}$. The reference fuzzy sets $X^r$
and $Y^r$ may be considered satisfactory,
and the computer simulations showed that
the basic properties of the input-output
behaviour are well preserved by neural
learning. A great improvement may be
obtained by adapting the output reference
fuzzy sets such that the fuzzy system
output approaches the desired output [2].
A classical adaptation algorithm is the
LMS procedure. The block scheme of the LMS
adaptation is depicted in Fig.3.

Suppose that at the moment k we have
the desired output vector $D(k) \in R^m$ and
the actual fuzzy system output
$Y(k) \in R^m$. The error vector E(k) is
given by:

\[ E(k) = D(k) - Y(k) \tag{10} \]

The LMS adaptation rules are:

\[ y_i^r(k+1) = y_i^r(k) + \mu\, e_i(k)\, \alpha^r(k), \quad i = 1..m,\ r = 1..N \tag{11} \]

where $\mu$ is the adaptation factor; k is
the iteration number; $\alpha^r(k)$ is given by:

\[ \alpha^r(k) = \frac{w^r(k)}{\sum_{r=1}^{N} w^r(k)}, \quad r = 1..N \tag{12} \]

where $w^r(k)$ is the truth degree of the
fuzzy rule number r.
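
The update rules (10) to (12) can be illustrated with a minimal sketch. It assumes, as is usual for an adaptive linear combiner but not stated explicitly in this section, that the system output is the normalized weighted sum of the output reference values, $y_i = \sum_r \alpha^r y_i^r$. The name `lms_step` and the default value of `mu` are hypothetical.

```python
import numpy as np

def lms_step(Y_ref, w, d, mu=0.5):
    """One LMS iteration on the output reference values.

    Y_ref : (N, m) array, row r holds y_i^r, i = 1..m
    w     : (N,) truth degrees w^r(k); their sum must be non-zero
    d     : (m,) desired output D(k)
    Returns (updated Y_ref, error vector E(k) before the update).
    """
    Y_ref = np.asarray(Y_ref, dtype=float)
    w = np.asarray(w, dtype=float)
    d = np.asarray(d, dtype=float)

    alpha = w / w.sum()                   # eq. (12)
    y = alpha @ Y_ref                     # assumed output: weighted average
    e = d - y                             # eq. (10)
    Y_new = Y_ref + mu * np.outer(alpha, e)   # eq. (11)
    return Y_new, e
```

Under this output model the error contracts by the factor $1 - \mu \sum_r (\alpha^r)^2$ per step for fixed truth degrees, so the step is stable for small positive `mu`.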

IV. About The Neuron Relocation 
Algorithm 

The neural relocation algorithm presented
above may be compared with other
clustering algorithms.

Its complexity level is the same as that
of Kohonen's procedure. Nevertheless,
after the background developing stage, the
"learning by insertion" method of the
neural relocation algorithm reduces the
time needed to adapt the network, because
copying the input-output behaviour is
cheaper than adapting the weights. If a
fuzzy rule is no longer needed (i.e. the
rule is inside the basin of attraction of
another rule) it is removed, so the
complexity diminishes.

Of course, the above algorithm could
insert a number of neurons equal to the
number of iterations. This can be balanced
by changing the maximum basin radius
$B_{max}$. The maximum radius is important
in minimizing both the errors and the
number of neurons. The minimum basin
radius may be chosen to be null or a small
positive value; it affects only the
convergence time.

The simulation results showed the
background developing stage to be very
important for the neuron cost. If this
stage is skipped, the initial iterations
add random neurons and it takes some time
to remove the wrongly positioned ones.
This stage forms a basis on which the
relocation features act well. The initial
number of neurons $N_0$ has a similar
importance for the neuron cost. It can be
null, but this is not recommended.

The Grossberg ART net [5] also has the
feature of inserting neurons based on a
parameter called vigilance. The same
problems (adding too many nodes or having
too little discrimination) arise. In
contrast to ART, the proposed model uses
one parameter controlling the insertion of
neurons, i.e. $B_{max}$, and also N
parameters controlling the adaptation and
deletion of the neurons (fuzzy rules),
i.e. $b^r$, r=1..N. The parameter $B_{max}$
is fixed, as the vigilance is in the ART
model, but the $b^r$ are adaptive
parameters.

Our simulations showed the time needed to
adapt the net to be at least 25% lower
than with the Kohonen self-organizing
method, and the number of neurons needed
to represent the input-output behaviour
dropped to about half. This can be
explained by the "insertion" effect and by
the radius of attraction, which can
substitute for neurons.

V. An application: Nonlinear 
System Modelling 

We have applied the DSOFS in a nonlinear
system modelling application. This problem
is well suited to the DSOFS. It involves
a model and a fuzzy system which "learns"
the behaviour of the model. The input and
the output of the DSOFS have the same
dimensions as those of the model. In the
first step of the synthesis a neural
network "learns" the behaviour of the
model. This phase gives us the reference
vectors, the number of fuzzy rules and
their basins of attraction. In the second
step of the synthesis the output reference
vectors are adapted through the LMS
procedure in order to obtain a better
resemblance to the model.

In the computer simulations presented
below we have used a multi-input
single-output nonlinear model with the
input-output behaviour depicted in
Figure 4, described by:

\[ y = \exp(-x_1^2 - x_2^2), \quad x_1 \in [-2, 2],\ x_2 \in [-2, 2] \tag{13} \]



The DSOFS has n=2, m=1 and rules of
the form:

if X is $X^r$ then y is $y^r$ with $b^r$, r=1..N (14)

where $X = \{x_1, x_2\}$, $X \in R^2$; $y \in R$;
$b^r \in R^+$.

In the self-organizing step of the
synthesis we have used a neural network
with inputs $x_1$, $x_2$ and y, and with
$N_0 = 40$ neurons (fuzzy rules). We stopped
the preliminary adaptation process before
the 1500-th iteration (developing
background step). Afterwards, we
successively presented samples from the
pattern according to the neural relocation
algorithm proposed above.

The similarity measure that we have used
in our computer simulations was of the
following form:

\[ d(\{x_1, x_2, y\}, \{h_{1r}, h_{2r}, h_{3r}\}) = a^2 (x_1 - h_{1r})^2 + a^2 (x_2 - h_{2r})^2 + b^2 (y - h_{3r})^2 \tag{15} \]

where $b > a$; $a, b \in R^+$.

The simulation results also showed that
the network tends to add fewer neurons as
the process continues, as an effect of
increasing the basins of attraction of the
fuzzy rules up to the maximum basin
radius. Fig.5 depicts an example of the
distribution of the rules at one step of
the neural relocation self-organizing
procedure. Every rule r is represented by
a circle with the center $(x_1^r, x_2^r)$
and the radius $b^r$.

After the neuron relocation
self-organizing procedure we have obtained
N=51 rules $\{x_1^r, x_2^r, y^r\}$ consisting
of the network weights
$\{h_{1r}, h_{2r}, h_{3r}\}$, r=1..N. Fig.6
depicts the output of the fuzzy system
after 5000 iterations of the
self-organizing procedure. We can note the
good topological resemblance to the model
(including the symmetries).

Fig.5 The Spatial Distribution of the Basins
of Attraction

Fig.6 [...] Organizing Procedure

Fig.7 [...] Adaptation Procedure

The truth degree of the rule r was
computed by:

\[ w^r(x_1, x_2) = g_1(x_1 - x_1^r)\, g_2(x_2 - x_2^r)\, g_3(b^r) \tag{16} \]

where $g_1$, $g_2$, $g_3$ are gaussian-like
functions. These vectors became a
background for the second step: LMS
adaptation of the output. In the LMS
adaptation procedure we redraw the output
reference fuzzy sets $y^r$ in order to
obtain better results. Fig.7 depicts the
output of the fuzzy system after 1000
iterations of the LMS procedure.

VI. Conclusions 

The Dynamic Self-Organizing Fuzzy
Systems have some major advantages based
on the rule adding/removing features and
the adaptation of the reference fuzzy
sets: (1) automatic synthesis based on the
neuron relocation self-organizing
procedure and the LMS adaptation; (2) the
possibility of parallel implementation.

The DSOFS background consists of a
self-organizing neural network with neuron
relocation features. The neural equivalent
of adding/removing rules is the relocation
of the neurons. According to the proposed
clustering algorithm, neurons (fuzzy
rules) are relocated and the fuzzy
reference sets for both input and output
are adapted in order to develop the
feature map formation. One can note a
higher adaptation speed and a reduced
number of neurons in comparison with
Kohonen's self-organizing model.

These advantages recommend the DSOFS for
problems involving modelling, automatic
fuzzy system synthesis and adaptation.
They can be used both in the developing
stage of other fuzzy systems and in
self-sustained applications.


References 

[1] Bezdek, J.C., Tsao, E.C.K., Pal, N.R.;
Fuzzy Kohonen Clustering Networks,
FUZZ-IEEE'92 - San Diego, California, 1992;

[2] Buhusi, C.V., Chelaru, M., Dragoi, V.;
Self-Organizing Fuzzy Systems, Int. Fuzzy
Syst. and Artif. Intell. Symp., Iasi,
Romania, 1991;

[3] Buhusi, C.V., Dragoi, V.; Neural
Structures with Neuron Relocation Features,
INFO-IASI Computer Science Symp. Proc.,
1991;

[4] Kohonen, T.; Self-Organization and
Associative Memory, Springer-Verlag,
Berlin, 1984;

[5] Grossberg, S.; The Adaptive Brain,
Elsevier/North-Holland, Amsterdam, 1986;

[6] Kang, G., Sugeno, M.; Fuzzy Modelling,
Trans. of SICE, 1987;

[7] Kohonen, T.; Internal Representation and
Associative Memory, Elsevier Science
Publishers, 1990;

[8] Kohonen, T.; Learning Vector
Quantization for Pattern Recognition, Tech.
Report of Helsinki Univ., 1986;

[9] Pedrycz, W.; Fuzzy Control and Fuzzy
Systems, Research Studies Press, John Wiley
& Sons, 1989;

[10] Pedrycz, W.; Tutorial on Fuzzy Systems
for Pattern Recognition, IIZUKA 1988;

[11] Takagi, H., Hayashi, I.;
Artificial_Neural_Network - Driven Fuzzy
Reasoning, Int. Workshop on Fuzzy Systems
Appl., IIZUKA 1988;

[12] Widrow, B., Stearns, S.; Adaptive Signal
Processing, Prentice Hall, 1985;

[13] Yamaguchi, T., Tanabe, M.,
Kuriyama, K., Mita, T.; Fuzzy Associative
Memory Application to Adaptive Control,
Proc. of IFSA 1991.




FUZZY LEARNING UNDER AND ABOUT AN UNFAMILIAR FUZZY TEACHER 

Belur V. Dasarathy
Dynetics, Inc.

P.O. Drawer 'B'

Huntsville, AL 35814-5050



ABSTRACT 


This study addresses the problem of optimal parametric learning in unfamiliar fuzzy environments. Prior
studies in the domain of unfamiliar environments, which employed either crisp or fuzzy approaches to model the
uncertainty or imperfectness of the learning environment, assumed that the training sample labels provided by the
unfamiliar teacher were crisp, even if not perfect. Here, the more realistic problem of fuzzy learning under an
unfamiliar teacher who provides only fuzzy (instead of crisp) labels is tackled by expanding the previously defined
fuzzy membership concepts to include an additional component representative of the fuzziness of the teacher. The
previously studied scenarios, namely, crisp and fuzzy learning under a (crisp) unfamiliar teacher, can be looked upon
as special cases of this new methodology. As under the earlier studies, the estimated membership functions can then be
deployed during the ensuing classification decision phase to judiciously take into account the imperfectness of the
learning environment. The study also offers some insight into the properties of several of these fuzzy membership
function estimators by examining their behavior under certain specific scenarios.

1. INTRODUCTION 

Probabilistic decision making in imperfectly supervised environments, i.e., scenarios wherein the labels of
the given training samples are unreliable, has been extensively studied in the literature over the past two decades
[1-7]. Typical of the learning models proposed are: probabilistic teacher [1], imperfect teacher [2, 3], unfamiliar
teacher [4], and VEDIC teacher [5]. A couple of fuzzy models [6, 7] have also been proposed recently. The probabilistic
teacher approach proposed by Agrawala [1], which represented the start of a whole new line of studies, essentially
disregards the given unreliable labels, i.e., treats the imperfect environment as unsupervised and uses a probabilistic
labelling scheme to learn the underlying parameters for the design of the classifier. On the other hand, the imperfect
teacher model proposed by Shanmugam [3] assumes that a precise knowledge of the level of imperfection ($\beta$) in the
environment is available a priori and uses this information to guide the parameter learning. This improves the quality
of learning over the probabilistic model only as long as the underlying assumption is valid, i.e., the level of
imperfection assumed is close to the reality. Otherwise, the resultant learning under the imperfect teacher is likely to
be worse than under the probabilistic teacher, which, in essence, assumes $\beta = 0.5$ for a two-class problem, or in a
more general case, $\beta = 1/m$, where m is the number of known pattern classes in the environment.

The unfamiliar teacher scheme reported by Dasarathy and Lakshminarasimhan [4] avoids both of these
complementary problems, of either having to disregard the imperfect labels entirely and lose some useful information or
making a possibly wrong assumption on the imperfectness level and thereby biasing the learning process. This is
accomplished by viewing the environment initially as unknown (i.e., starting the learning process in much the same
manner as the probabilistic teacher scheme, with $\beta$ as 1/m) and then learning $\beta$ about the environment
simultaneously with the learning of the parameters for the classifier system design. This learning about the teacher has
been shown to aid and enhance the learning under the teacher (see Figure 2 of reference [4]). This approach was extended
by Dasarathy and Lakshminarasimhan [5] to dynamic scenarios using the VEDIC teacher model, wherein the level of
imperfection $\beta$, in addition to being unknown, is also changing with time.

Recently, fuzzy models [6, 7] were proposed to effectively capture the uncertainties caused by the
imperfectness of these crisp teachers. Underlying these fuzzy techniques is the need to define a fuzzy membership matrix
for the given training data set. Various approaches have been proposed for learning these membership functions. The
objective of the study being reported here is to adapt this novel concept of fuzzy learning under an unfamiliar teacher
to the problem of learning in an environment that is not only unfamiliar, i.e., labeling information is of unknown
level of reliability, but also fuzzy, i.e., the given training samples are associated with multiple classes (rather than
just one) with membership distributed across the pattern classes. The fuzzy membership functions to be learnt during
the training phase reflect not only the inherent imperfectness but also the fuzziness in the class association provided
by the unfamiliar fuzzy teacher. These fuzzy memberships can then be used in the classification phase to appropriately
bias the decision process. Details of the integration of the concepts of fuzzy learning under an unfamiliar crisp
teacher with those of a fuzzy teacher are presented in the sequel. Section 2 briefly reviews the basic crisp
learning under an unfamiliar crisp teacher. Section 3 provides a short overview of fuzzy learning under an unfamiliar
but still crisp teacher. In section 4, the adaptation and extension of these ideas to the scenario of fuzzy learning
under an unfamiliar fuzzy teacher is presented. The associated algorithmic procedure is outlined in section 5 to aid the
implementation of the new methodology. Section 6 outlines some potential alternatives to the initially proposed
fuzzy membership model. The last section offers some concluding comments.

2. CRISP LEARNING UNDER AN UNFAMILIAR CRISP TEACHER 

The intuitively appealing concept of learning about an unfamiliar teacher as an aid to learning under the
teacher, which underlies this study, was first proposed and successfully demonstrated by Dasarathy and
Lakshminarasimhan [4] in 1976 in a two-class crisp environment. They showed that this learning under an unfamiliar
teacher is indeed an efficient and practical tool for learning in imperfectly supervised environments wherein it is
unrealistic to assume that the level of imperfectness is known a priori, the basis of earlier studies in this area. This
dual learning process, of learning about the teacher concurrently with parametric learning under the teacher, is
schematically illustrated in Figure 1.



Figure 1. Crisp Learning in Unfamiliar Crisp Teacher Environments 


Here the learning about the teacher consists in learning $\beta$, the effective level of imperfectness in the labels
provided by the teacher (environment). This is modeled as a Bernoulli trial with parameter $\beta$, and a Bayes estimator
for minimum quadratic loss, which has a beta distribution [8], is set up for the estimation of $\beta$. The learning under
the teacher consists in learning the parameters of the underlying distributions, which is essential for classification in
the operational phase, the primary objective of the effort. It is to be noted that this learning scheme [4] in essence
encompasses the spectrum of learning scenarios, starting from learning with a perfect teacher ($\beta = 1$) up to learning
without a teacher or learning with a probabilistic teacher ($\beta = 1/m$), through learning with a known imperfect teacher
(i.e., $\beta$, $1/m \le \beta \le 1$, is known a priori), and ultimately learning under the most realistic of these
scenarios, namely learning with an unfamiliar teacher, i.e., $\beta$ ($1/m \le \beta \le 1$) is unknown a priori and is
learnt simultaneously with parametric learning. Further implementation details of this learning process can be gleaned
from [4] and, as such, are not presented here to save on publication space.

3. FUZZY LEARNING UNDER AN UNFAMILIAR CRISP TEACHER 

The unfamiliar teacher scheme discussed in the previous section was synergistically combined recently [6]
with the now well-understood concepts of fuzzy membership to derive a potentially powerful tool of fuzzy learning
in unfamiliar teacher environments. This integrated learning is schematically illustrated in Figure 2. Here, the
learning includes not only the distribution parameters and the imperfectness level (as outlined in the previous section),
but also the fuzzy membership values generated for each of the input samples by the fuzzy modeling of the
uncertainties in the learning environment. This synergism permits the user to exploit the benefits of both the unfamiliar
teacher hypothesis and the fuzzy learning concepts. The algorithmic and other details of this integrated
scheme of learning, as well as the associated fuzzy membership models, their alternatives and properties, being
readily available in the study published recently [6], are not repeated here in the interest of conservation of
publication space.











Figure 2. Fuzzy Learning in Unfamiliar Crisp Teacher Environments

4. FUZZY LEARNING UNDER AN UNFAMILIAR FUZZY TEACHER

The scheme of fuzzy learning under an unfamiliar teacher, outlined in the previous section, assumed that the
labels provided by this unfamiliar teacher were crisp, even if imperfect. However, in real-world environments, the
imperfect teacher is likely to be fuzzy also. The previously reported fuzzy model [6], which was postulated to take
into account only the imperfectness of the teacher, had no provision for taking into consideration the fuzziness in the
teacher behavior. Accordingly, a more generalized fuzzy membership model, viewed as the sum of two weighted
components, is proposed here. This new learning process is schematically shown in Figure 3.



Figure 3. Fuzzy Learning in Unfamiliar Fuzzy Teacher Environments 


The new fuzzy membership function, in effect, captures both the imperfectness and fuzziness of the
unfamiliar teacher environment during the learning phase. This is then used to correspondingly weight the decisions
made in the classification phase. This is the central idea of the methodology presented in this study. The
recursive learning process necessary for accomplishing this objective can be viewed as one of upgrading the fuzzy
membership values furnished by the teacher for each training sample, simultaneously with the learning of the underlying
distribution parameters and the level of imperfectness of the supervision available in the environment.

The input to this recursive triple learning process consists of:

• a set of training samples or feature vectors $\{x_i : i = 1, \ldots, n\}$

• a set of corresponding fuzzy label memberships $\{\{v_{ij} : i = 1, \ldots, n\};\ j = 1, \ldots, m\}$

These label memberships are assumed to have a level of reliability $\beta$ which is unknown at the start of the
learning process and is learnt during the learning process simultaneously with the parameters of the underlying
distributions. This learning begins with an assumption of $\beta = 1/m$, i.e., the labels are essentially disregarded. Thus,
initially each sample would have membership values in all the given classes in proportion to the a priori probabilities
of these classes in the environment, since we do not as yet have any measure of confidence in the fuzzy labeling
information furnished by the unfamiliar teacher. Under equal a priori probabilities of the classes, the membership
function values for each sample will be 1/m provided, of course, the environment is completely exposed, i.e., all the
classes expected in the environment are represented in the training set. Otherwise, one will have to incorporate into
this scenario additional concepts such as learning in partially exposed environments [9] that have been developed for
dealing with cases wherein all the classes are not representatively known at the start of the learning process. This would
involve adding the flexibility of a reject option to the classification phase and hence a method of defining or learning
the boundaries of the currently known classes relative to the rest of the world in addition to learning the boundaries
between the known classes. While this is conceivable in the light of the reported developments [9], it is not considered
here as being outside the scope of the current study.


As the recursive learning progresses, each sample is assigned probabilities of belonging to the different
classes by the unfamiliar teacher scheme (in a manner similar to equation (4) of reference [4] but modified to take
into account multiple classes and the current fuzzy membership function values to correspondingly weight the different
a priori probabilities). Then, one can update the class fuzzy membership values based on not only the furnished
fuzzy membership values, but also on the relative proportionalities of the a posteriori probabilities and $\beta$, the
imperfectness of the unfamiliar teacher. Let $p_{ij}$ be the a posteriori probability of $x_i$ being assigned to class j,
computed on the basis of not only the feature vector values but also the current fuzzy membership functions and
imperfectness measure. Then the updated membership function value $u_{ij}$, of $x_i$ being in class j, is given in terms of
the two weighted components as shown in expression (1):


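\[ u_{ij} = \alpha\, v_{ij} + (1 - \alpha)\, w_{ij} \tag{1} \]

where

\[ w_{ij} = f(\beta, m, p_{ij}, j = 1, \ldots, m) = \begin{cases} \dfrac{\beta\, p_{iL_i}}{\beta\, p_{iL_i} + \dfrac{(1-\beta)}{(m-1)} \sum\limits_{k=1,\, k \ne L_i}^{m} p_{ik}}, & j = L_i \\[3ex] \dfrac{\dfrac{(1-\beta)}{(m-1)}\, p_{ij}}{\beta\, p_{iL_i} + \dfrac{(1-\beta)}{(m-1)} \sum\limits_{k=1,\, k \ne L_i}^{m} p_{ik}}, & j \ne L_i,\ j = 1, \ldots, m \end{cases} \tag{2} \]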







Here, the first component, $v_{ij}$, is the one furnished by the fuzzy teacher. The second component, $w_{ij}$, is
determined by the learning system (in a manner similar to that proposed in the previous study) to account for the
unreliability aspect of the unfamiliar teacher, and $\alpha$ is the relative weighting of the two components. When
$\alpha = 0$, this effectively corresponds to the scenario studied previously in [6] with the unfamiliar teacher providing
crisp labels. At the other end of the spectrum, i.e. $\alpha = 1$, we only have the fuzziness defined by the teacher with
no term to take into account the imperfectness of the teacher within the fuzzy membership model (crisp learning under a
fuzzy teacher, a not very convincing model of learning). A conceptually elegant choice for $\alpha$ is given by
equation (3):

\[ \alpha = \frac{(m\beta - 1)}{(m - 1)} \tag{3} \]


Here, as $\beta \to 1$, i.e. as the teacher progressively becomes more and more reliable, the imperfectness in the labeling
reduces, $\alpha \to 1$, and more reliance is placed on the teacher-provided fuzzy label information ($v_{ij}$) and less on
the recursively determined component ($w_{ij}$). On the other hand, as $\beta \to 1/m$, i.e. as the teacher becomes less
reliable and tends towards the unsupervised scenario, $\alpha \to 0$, the fuzzy membership information provided by the
teacher becomes less relevant and more weight is given to the component determined by the actual a posteriori
probabilities.


Equation (1) can be rewritten using equation (3) as

\[ u_{ij} = \frac{(m\beta - 1)}{(m - 1)}\, v_{ij} + \frac{m(1 - \beta)}{(m - 1)}\, w_{ij} \tag{4} \]

Here, we have

\[ \sum_{j=1}^{m} p_{ij} = 1 \tag{5} \]


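Substituting equation (5) in the expression (2) we can rewrite (2) as

\[ w_{ij} = \begin{cases} \dfrac{(m-1)\, \beta\, p_{iL_i}}{[(1-\beta) + (m\beta - 1)\, p_{iL_i}]}, & j = L_i \\[3ex] \dfrac{(1-\beta)\, p_{ij}}{[(1-\beta) + (m\beta - 1)\, p_{iL_i}]}, & j \ne L_i,\ j = 1, \ldots, m \end{cases} \tag{6} \]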


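For the special case of m = 2, which corresponds to the classical detection or binary decision problem,
expression (6) reduces to

\[ w_{ij} = \begin{cases} \dfrac{\beta\, p_{iL_i}}{[(1-\beta) + (2\beta - 1)\, p_{iL_i}]}, & j = L_i \\[3ex] \dfrac{(1-\beta)(1 - p_{iL_i})}{[(1-\beta) + (2\beta - 1)\, p_{iL_i}]}, & j \ne L_i \end{cases} \tag{7} \]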
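
Expressions (3), (4) and (6) can be evaluated for one sample with the short sketch below. It is an illustration only: the function names `second_component` and `membership`, and the convention that `label` is the zero-based index of the class $L_i$, are hypothetical. The denominator of (6) vanishes when $\beta = 1$ and $p_{iL_i} = 0$; that corner case is not handled.

```python
import numpy as np

def second_component(p, label, beta):
    """w_ij of expression (6) for one sample.

    p     : (m,) a posteriori probabilities p_ij, summing to one (eq. (5))
    label : zero-based index of the labelled class L_i
    beta  : reliability level, 1/m <= beta <= 1
    """
    p = np.asarray(p, dtype=float)
    m = p.size
    denom = (1.0 - beta) + (m * beta - 1.0) * p[label]
    w = (1.0 - beta) * p / denom                       # branch j != L_i
    w[label] = (m - 1) * beta * p[label] / denom       # branch j == L_i
    return w

def membership(v, p, label, beta):
    """u_ij of expression (4), with alpha taken from equation (3)."""
    v = np.asarray(v, dtype=float)
    m = v.size
    alpha = (m * beta - 1.0) / (m - 1.0)               # eq. (3)
    return alpha * v + (1.0 - alpha) * second_component(p, label, beta)
```

For example, `membership([0.7, 0.2, 0.1], [0.5, 0.3, 0.2], 0, 0.8)` mixes the teacher's fuzzy label with the learnt component; provided the teacher memberships sum to one, so does the result.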


Here, it is interesting to note that in equation (6), $w_{ij} : j = L_i$ is symmetrically dependent on $\beta$ and
$p_{ij} : j = L_i$. As the supervision improves, i.e., as $\beta$ increases towards unity, the fuzziness due to imperfectness
reduces (the membership function component $w_{ij}$ approaches unity for the class corresponding to the given label), but
its relative weightage in equation (4) reduces. When the a posteriori probability increases (for a given imperfectness
level of the supervision), the component $w_{ij} : j = L_i$ once again approaches unity, and thereby contributes to a
corresponding increase in $u_{ij} : j = L_i$ also, since $\beta$ is not decreasing. Thus, although this second component
($w_{ij} : j = L_i$) is symmetric with respect to the imperfectness level and a posteriori probability, the total fuzzy
membership function ($u_{ij} : j = L_i$) is not symmetric.


At the other end of the spectrum, when $\beta = 1/m$, i.e., with essentially no supervision, expression (4)
reduces to:

\[ u_{ij} = w_{ij} = p_{ij}; \quad \forall\, j = 1, \ldots, m \tag{8} \]

Thus, under the unsupervised scenario, the fuzzy membership values are dictated wholly by the relative a posteriori
probabilities of the sample belonging to different classes computed on the basis of the estimated values of the
distribution parameters. For all other values of $\beta$ in the range $(1/m) < \beta < 1.0$, the fuzzy membership value is a
function of both the relative a posteriori probabilities and the reliability level of the labels provided by the teacher,
as given by equations (4) and (6). Since $p_{ij}$ and $\beta$ can be computed during the sequential learning scheme based on
the unfamiliar teacher concepts (using appropriately modified forms of equations (4) and (5) in reference [4]), we can
continually update $w_{ij}$ and hence $u_{ij}$ also. This construct also assures consistency with the definition of these
functions, i.e., the sum of membership function values for every sample is equal to unity. For cases wherein the a
posteriori probabilities $p_{ij}$ are all equal for a given sample $x_i$ (i.e., $p_{ij} = 1/m$ for all $j = 1, \ldots, m$),
expression (6) reduces to:

\[ w_{ij} = \begin{cases} \beta, & j = L_i \\[1ex] \dfrac{(1-\beta)}{(m-1)}, & j \ne L_i \end{cases} \tag{9} \]


We can also derive the case for which the second components of all the membership functions become equal,
i.e.,

\[ w_{ij} = w_i \quad \forall\, j = 1, \ldots, m \tag{10} \]

From expression (6), this requires

\[ p_{ij} = \frac{(m-1)\, \beta\, p_{iL_i}}{(1-\beta)}, \quad j \ne L_i,\ \beta \ne 1 \tag{11} \]

It is interesting to note that equation (11) reduces to the previously discussed case of equal a posteriori
probabilities for all classes when $\beta = 1/m$.


In view of the fact that the sum of the a posteriori probabilities of all classes is unity (equation (5)), equation
(11) in effect defines a specific value for the a posteriori probability as:

\[ p_{iL_i} = \frac{(1-\beta)}{[1 + m(m-2)\beta]} \tag{12} \]

Correspondingly, equation (11) becomes

\[ p_{ij} = \frac{(m-1)\,\beta}{[1 + m(m-2)\beta]}, \quad j \ne L_i \tag{13} \]

Equation (12) reduces to 1/m for $\beta = 1/m$ and to $(1 - \beta)$ for m = 2, the two special cases previously
considered here. Correspondingly, equation (13) reduces to 1/m for $\beta = 1/m$ and to $\beta$ for m = 2.
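
As a cross-check, equation (12) can be obtained from equations (5) and (11) in a few lines. The derivation below is only a worked restatement of that step with the same symbols; no new quantities are introduced.

```latex
\begin{aligned}
1 &= p_{iL_i} + \sum_{j \ne L_i} p_{ij}
   = p_{iL_i} + (m-1)\,\frac{(m-1)\,\beta\,p_{iL_i}}{(1-\beta)} \\
\Rightarrow\quad
p_{iL_i} &= \frac{(1-\beta)}{(1-\beta) + (m-1)^2\,\beta}
          = \frac{(1-\beta)}{[1 + m(m-2)\,\beta]},
\qquad \text{since } (m-1)^2 - 1 = m(m-2).
\end{aligned}
```

Substituting this value back into equation (11) gives equation (13) directly.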


5. ALGORITHMIC PROCEDURE

(i) Set $\beta = \beta_0$ at an appropriate level ($1/m \le \beta_0 \le 1$); the scheme is not very sensitive to this initial
value
(ii) Input the new sample [...]
(iii) [...] estimates as known, using the fuzzy membership function values to appropriately weight the
corresponding classes (by modifying the multiclass version of equation (4) of reference [4])
(iv) [...] event; else set $Y_i = 0$
(v) Update $\beta$ (using equation (5) of reference [4])
(vi) Update $w_{ij}$ using equation (6)
(vii) Update $u_{ij}$ using equation (4)
(viii) Go back to step (ii) for the next sample
(ix) Repeat the procedure till all the samples have been processed

6. SCOPE FOR EXTENSIONS 


In the analysis hitherto, it was assumed that the imperfectness $\beta$ of the unfamiliar teacher is constant across
the different pattern classes. However, in the real-world environment, this may not always be true since information
[...] may be more reliable than that available for other classes. For example, in [...] truly optimal. Accordingly,
the recursive learning process [...] into account by the learning process [...] different classes. Here, these multiple
[...] This is accomplished by employing an estimator illustrated by the following expression.


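$$w_{ij} = f(\beta_j, m, p_{ij};\ j = 1, \ldots, m) = \begin{cases} \dfrac{\beta_{L_i}\,p_{iL_i}}{\beta_{L_i}\,p_{iL_i} + \dfrac{(1-\beta_{L_i})}{(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} p_{ik}}, & j = L_i \\[4ex] \dfrac{\dfrac{(1-\beta_{L_i})}{(m-1)}\,p_{ij}}{\beta_{L_i}\,p_{iL_i} + \dfrac{(1-\beta_{L_i})}{(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} p_{ik}}, & j = 1, \ldots, m;\ j \neq L_i \end{cases} \tag{14}$$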




However, in order to ensure that the summation across the classes of the membership function $u_{ij}$ equals unity in
general for each sample, the weights of the two components in equation (1) have to be constant and independent of
the class. Hence, equation (3), as defined earlier, cannot be validly employed whenever $\beta$ is not constant across the
pattern classes. Accordingly, equation (3) is modified to be:


$$\alpha = \frac{\left(\sum_{j=1}^{m} \beta_j - 1\right)}{(m-1)} = \frac{(m \langle\beta\rangle - 1)}{(m-1)} \tag{15}$$

where $\langle\beta\rangle$ is the average value of $\beta$ across the classes.

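Using equation (5), expression (14) can again be restructured as

$$w_{ij} = \begin{cases} \dfrac{(m-1)\,\beta_{L_i}\,p_{iL_i}}{[(1-\beta_{L_i}) + (m\beta_{L_i}-1)\,p_{iL_i}]}, & j = L_i \\[3ex] \dfrac{(1-\beta_{L_i})\,p_{ij}}{[(1-\beta_{L_i}) + (m\beta_{L_i}-1)\,p_{iL_i}]}, & j = 1, \ldots, m;\ j \neq L_i \end{cases} \tag{16}$$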

As before, for the case m = 2, i.e., a binary decision case with different levels of reliability for the labels of 
the target samples from the two classes (for example lethal and benign), we have 


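$$w_{ij} = \begin{cases} \dfrac{\beta_{L_i}\,p_{iL_i}}{[(1-\beta_{L_i}) + (2\beta_{L_i}-1)\,p_{iL_i}]}, & j = L_i \\[3ex] \dfrac{(1-\beta_{L_i})\,p_{ij}}{[(1-\beta_{L_i}) + (2\beta_{L_i}-1)\,p_{iL_i}]}, & j \neq L_i \end{cases} \tag{17}$$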


Expression (9), which represents the case of equal a posteriori probabilities $p_{ij}$, will also get modified
correspondingly whenever the imperfectness levels for the different classes are estimated separately. Expression (14)
reduces to:

$$w_{ij} = \begin{cases} \beta_{L_i}, & j = L_i \\[1ex] \dfrac{(1-\beta_{L_i})}{(m-1)}, & j \neq L_i \end{cases} \tag{18}$$
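As an illustrative check, expression (14) and its restructured form (16) can be evaluated side by side. The sketch below assumes the forms of (14) and (16) stated in the docstrings, and that the a posteriori probabilities of a sample sum to unity as equation (5) requires; the function names are hypothetical.

```python
from fractions import Fraction


def w_expression_14(p, label, beta_label):
    """w_ij with the explicit sum over the non-labeled classes.

    p          : a posteriori probabilities p_ij of one sample (sum to 1)
    label      : index L_i of the class assigned by the teacher
    beta_label : imperfectness level of the labeled class
    """
    m = len(p)
    off = (1 - beta_label) / (m - 1)
    rest = sum(pk for k, pk in enumerate(p) if k != label)
    denom = beta_label * p[label] + off * rest
    return [(beta_label * p[j] if j == label else off * p[j]) / denom
            for j in range(m)]


def w_expression_16(p, label, beta_label):
    """Same quantity with the sum eliminated through sum_k p_ik = 1."""
    m = len(p)
    denom = (1 - beta_label) + (m * beta_label - 1) * p[label]
    return [((m - 1) * beta_label * p[label] if j == label
             else (1 - beta_label) * p[j]) / denom
            for j in range(m)]


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    beta = Fraction(4, 5)
    print(w_expression_14(p, 0, beta))
    print(w_expression_16(p, 0, beta))
```

Both forms return the same values whenever the probabilities are normalized, and for equal probabilities they collapse to the two constants of expression (18). Neither function guards against a zero denominator (for example a perfectly reliable label on a class with zero a posteriori probability).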


Again, for the second component of all the different class memberships of a given sample to be equal (equation (10)),
the following can be derived from expression (14):

$$p_{ij} = \frac{(m-1)\,\beta_{L_i}\,p_{iL_i}}{(1-\beta_{L_i})} \quad \forall\, j = 1, \ldots, m;\ j \neq L_i;\ \beta_{L_i} \neq 1 \tag{19}$$


which as before is again subject to the constraint equation (5). Hence, from equations (5) and (19), we get

$$p_{iL_i} = \frac{(1-\beta_{L_i})}{[m(m-2)\beta_{L_i}+1]};\quad \beta_{L_i} \neq 1 \tag{20}$$





For the binary decision case, i.e., $m = 2$, this reduces to

$$p_{iL_i} = (1-\beta_{L_i}); \quad p_{ij} = \beta_{L_i} \tag{21}$$

The corresponding steps of the algorithmic procedure outlined in section 5 should be appropriately modified to reflect
this variable nature of $\beta$ across the classes, assuming the $\beta$'s to be statistically independent.


In expression (14), the fuzzy membership component $w_{ij}$ is a function of only the imperfectness of the
class represented by the given label, i.e., it is independent of the quality of supervision available for classes other
than the one to which the sample is assigned by the teacher. A more realistic, but complex, model would be of the form:


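$$w_{ij} = \begin{cases} \dfrac{\beta_{L_i}\,p_{iL_i}}{\beta_{L_i}\,p_{iL_i} + \dfrac{m(1-\beta_{L_i})}{(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} (1-\beta_k)\,p_{ik}}, & j = L_i \\[4ex] \dfrac{\dfrac{m(1-\beta_{L_i})}{(m-1)}\,(1-\beta_j)\,p_{ij}}{\beta_{L_i}\,p_{iL_i} + \dfrac{m(1-\beta_{L_i})}{(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} (1-\beta_k)\,p_{ik}}, & j = 1, \ldots, m;\ j \neq L_i \end{cases} \tag{22}$$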


It is interesting to note that in expression (22), $w_{ij}$ becomes a function of the imperfectness levels of all the
different classes in the environment while retaining the uniqueness of the original expression (14) for the perfectly
supervised class case of $\beta_{L_i} = 1$. (However, unlike equations (2) and (14), equation (22) cannot be restructured to
eliminate the summation over $p_{ik}$ because of the presence of the variable $\beta_k$ within the summation term.)
Expression (22) can therefore be viewed as a more realistic portrayal of the imperfectness in the environment for the
classification phase. Under this model, a crisp (i.e., 1 or 0 value) scenario gets established for $w_{ij}$ whenever just the
single corresponding imperfectness parameter disappears ($\beta_j = 1$). This therefore represents an .OR. logic based
dependence across the imperfectness values. One can also visualize an .AND. logic based version of the model as:


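$$w_{ij} = \begin{cases} \dfrac{\beta_{L_i}\,p_{iL_i}}{\beta_{L_i}\,p_{iL_i} + \dfrac{1}{2(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} (2-\beta_{L_i}-\beta_k)\,p_{ik}}, & j = L_i \\[4ex] \dfrac{\dfrac{(2-\beta_{L_i}-\beta_j)}{2(m-1)}\,p_{ij}}{\beta_{L_i}\,p_{iL_i} + \dfrac{1}{2(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} (2-\beta_{L_i}-\beta_k)\,p_{ik}}, & j = 1, \ldots, m;\ j \neq L_i \end{cases} \tag{23}$$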
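The difference between the .OR. and .AND. dependence can be made concrete with a short sketch. It assumes a particular reading of expressions (22) and (23), spelled out in the comments: the labeled class contributes $\beta_{L_i} p_{iL_i}$, every other class contributes a term scaled by the imperfectness levels, and the terms are normalized to sum to unity. The function names are hypothetical.

```python
from fractions import Fraction


def w_or_model(p, label, betas):
    """.OR. version, assuming expression (22) has non-label terms
    m (1 - beta_L)(1 - beta_j) p_ij / (m - 1)."""
    m = len(p)
    scale = m * (1 - betas[label]) / (m - 1)
    terms = [betas[label] * p[j] if j == label
             else scale * (1 - betas[j]) * p[j]
             for j in range(m)]
    total = sum(terms)
    return [t / total for t in terms]


def w_and_model(p, label, betas):
    """.AND. version, assuming expression (23) has non-label terms
    (2 - beta_L - beta_j) p_ij / (2 (m - 1))."""
    m = len(p)
    terms = [betas[label] * p[j] if j == label
             else (2 - betas[label] - betas[j]) * p[j] / (2 * (m - 1))
             for j in range(m)]
    total = sum(terms)
    return [t / total for t in terms]


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    betas = [Fraction(1), Fraction(1, 2), Fraction(1, 2)]
    # Perfect supervision of the labeled class only:
    print(w_or_model(p, 0, betas))   # crisp under the .OR. reading
    print(w_and_model(p, 0, betas))  # still fuzzy under the .AND. reading
```

Under this reading the .AND. version reduces to expression (14) when all classes share the same imperfectness level, while the .OR. version becomes crisp as soon as the labeled class alone is perfectly supervised.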





7. CONCLUDING COMMENTS 


The study offers a potent tool for learning and operating in imperfectly supervised fuzzy environments.
This is accomplished by treating the fuzzy environment as essentially unfamiliar at the initiation of the learning
process and thereafter learning about the environment in terms of the level of imperfectness and the fuzzy membership
values for each training sample concurrently with the primary learning task of determining the underlying
distribution parameters. The major innovation of this study is the development of a unique concept of jointly capturing
within the defined fuzzy framework both the imperfectness of the unfamiliar teacher as well as the fuzziness in the
labeling provided. Admittedly, alternative fuzzy model formulations (such as, for example, equation (24)) can easily be
conceived.

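$$w_{ij} = \begin{cases} \dfrac{\displaystyle\prod_{k=1}^{m} \beta_k\; p_{iL_i}}{\displaystyle\prod_{k=1}^{m} \beta_k\; p_{iL_i} + \dfrac{1}{2(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} (2-\beta_{L_i}-\beta_k)\,p_{ik}}, & j = L_i \\[4ex] \dfrac{\dfrac{(2-\beta_{L_i}-\beta_j)}{2(m-1)}\,p_{ij}}{\displaystyle\prod_{k=1}^{m} \beta_k\; p_{iL_i} + \dfrac{1}{2(m-1)} \displaystyle\sum_{k=1,\,k \neq L_i}^{m} (2-\beta_{L_i}-\beta_k)\,p_{ik}}, & j = 1, \ldots, m;\ j \neq L_i \end{cases} \tag{24}$$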


The approach can also be extended to dynamic environments by combining the VEDIC teacher concepts [5,7] with 
the dual-component based membership function learning methodology developed here. 

ACKNOWLEDGMENTS 

The author would like to take this opportunity to gratefully acknowledge the IR&D funding support pro- 
vided for this study by Mr. Tom Baumbach, Executive Vice-President, Dynetics, Inc. 

REFERENCES 

[1] A. K. Agrawala, “Learning with a Probabilistic Teacher,” IEEE Transactions on Information Theory, Vol.
IT-16, No. 4, pp. 373-379, July 1970.

[2] K. Shanmugam and A. M. Breipohl, “An Error Correcting Procedure for Learning with an Imperfect
Teacher,” IEEE Transactions on Systems, Man, and Cybernetics, Vol. SMC-1, No. 3, pp. 223-229, July 
1971. 

[3] K. Shanmugam, “A Parametric Procedure for Learning with an Imperfect Teacher,” IEEE Transactions on 
Information Theory, Vol. IT-18, No. 2, pp. 300-302, March 1972. 

[4] B. V. Dasarathy and A. L. Lakshminarasimhan, “Sequential Learning Employing Unfamiliar Teacher
Hypotheses (SLEUTH) With Concurrent Estimation of Both the Parameters and Teacher Characteristics,”
International Journal of Computer and Information Sciences, Vol. 5, No. 1, pp. 1-7, March 1976.

[5] B. V. Dasarathy and A. L. Lakshminarasimhan, “Learning Under a VEDIC Teacher,” International Journal 
of Computer and Information Sciences, Vol. 8, No. 1, pp. 75-88, March 1979. 

[6] B. V. Dasarathy, “FLUTE: Fuzzy Learning in Unfamiliar Teacher Environments,” Proceedings of the IEEE 
International Conference on Fuzzy Systems, San Diego, CA., pp. 1070-1077, March 1992. 

[7] B. V. Dasarathy, “Fuzzy Learning in Vicissitudinous Environments,” Proceedings of the 11th IAPR
International Conference on Pattern Recognition, The Hague, Netherlands, August 1991.

[8] E. A. Patrick, Fundamentals of Pattern Recognition. Prentice Hall, 1971 

[9] B. V. Dasarathy, Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, IEEE Computer 
Press, California, Chapter 5, pp. 235-284, 1991. 





N93-29559 


Some Problems with the Design of Self-Learning Management Systems


Ziny Flikop 

NYNEX Science and Technology, Inc. 

500 Westchester Avenue, White Plains, NY, 10604 USA 





Abstract 

In this paper some problems in the design of management systems for complex objects are discussed. Considering the 
absence of adequate models and the fact that human expertise in the management of non-stationary objects becomes 
obsolete quickly, the use of self-learning together with a two-step optimization of on-line control rules is suggested. 
To prepare for the object analysis, a set of definitions has been proposed. Traditional and fuzzy sets [1, 2] approaches
are used in the analysis. To decrease the reaction time of the control system, we propose the development of control 
rules without feedback. 

Keywords: Control Processes, Decision Theory, Fuzzy Sets, Optimization 

1 Introduction 

Automatic and semi-automatic control and management systems usually are based on sets of control rules. The
development of such rules requires either comprehensive human expertise or an adequate object model or both. However,
the design of reliable models of complex objects is often a very difficult task and human expertise in the control of the 
non-stationary objects becomes obsolete with time. The traditional use of a control with feedback results in prolonged 
reaction time. These problems can be partially avoided if a self-learning approach is used for the development of con- 
trol rules. In this paper, which is but another drop in the sea of control and management literature, we too are propos- 
ing and studying some variant of such an approach. 

The methodology for synthesizing management systems depends on the complexity of the controlled object. In this
paper we will discuss the management of complex controlled objects with multiple inputs and outputs. (Queueing
networks and assembly plants are examples of such objects.) The object transformation function (mapping) is defined
by an object organization (structure) and by the values of the object element set parameters (values of control
variables). Object performance is evaluated by multiple criteria (via multiple controlled variables). Management system
performance is evaluated by the management system’s ability to maintain outputs at the predefined level (the simple
control task) and by its ability to minimize the “cost” required for the control (the optimized control task). Modern
management systems are based on relatively powerful computers and execute their tasks by varying control
variables and/or object structure. In the development of management systems, one should consider that management
system reaction time must be much shorter than the time scales of input drift and of environmental and structural changes.

The ability of management systems to control and optimize objects depends on the efficiency of the algorithms used
for these purposes. In turn, these algorithms depend on continuity, separability, and monotonicity of controlled object
mapping functions. In general, the behavior of mapping functions depends on the nature of controlled objects. While
simple physical devices often have mapping functions that are continuous, separable, and monotonic, this is
not always the case for more complex controlled objects. When mapping functions are continuous, separable, and
monotonic, relatively simple control and optimization algorithms can be applied. However, when mapping functions
are not continuous, separable, and monotonic, then significantly more powerful algorithms are needed. Because the
behavior of mapping functions imposes limitations on the selection of control and optimization algorithms, and since
the proposed self-learning methodology is based on such algorithms, let’s define what we mean by continuity,
separability, and monotonicity. Let’s also define several other terms used in this paper.

2 Definitions 

From a management point-of-view, an object can be defined via the following mapping functions: 

$$F(V,H): X \to Y, \quad F(X,H): V \to Y \quad \text{and} \quad F(X,H): V \to Q \tag{1}$$





where: 


$X = (x^1, x^2, \ldots, x^m)$ is an input vector, $m = |M|$, $M$ is a set of input variables;

$Y = (y^1, y^2, \ldots, y^n)$ is an output vector, $n = |N|$, $N$ is a set of output variables;

$V = (v^1, v^2, \ldots, v^p)$ is a vector of control variables, $p = |P|$, $P$ is a set of control variables;

$Q = (q^1, q^2, \ldots, q^b)$ is a vector of controlled variables, $b = |B|$, $B$ is a set of controlled variables;

$H = \{h_l\}$ is a controlled object structure, $h_l$ is an object element, $l \in L$.

We can define some properties of these functions, which are useful for management system development:

a. Let’s consider a function $F(V,H): (X+\lambda) \to (Y+\mu)$ with any fixed $V$ and $H$ as continuous into space $C$ if
when $\lambda \to 0$, then $\mu \to 0$ for $\forall X \in C$, $(X+\lambda) \in C$. (2)

b. Let’s consider a function $F(X,H): (V+\eta) \to (Y+\mu)$ with any fixed $X$ and $H$ as continuous into space $A$ if
when $\eta \to 0$, then $\mu \to 0$ for $\forall V \in A$, $(V+\eta) \in A$. (3)

c. Let’s consider a function $F(X,H): (V+\eta) \to (Q+\rho)$ with any fixed $X$ and $H$ as continuous into space $L$ if
when $\eta \to 0$, then $\rho \to 0$ for $\forall V \in L$, $(V+\eta) \in L$. (4)


d. Let’s consider a function $F(V,H): X \to Y$ with any fixed $V$ and $H$, and $X \in C$, as monotone if, when $X$ is
changing in one direction along some monotone trajectory into $C$, then $Y$ is also changing in one direction
along a monotonic trajectory into output space. (5)

e. Let’s consider a function $F(X,H): V \to Y$ with any fixed $X$ and $H$, and $V \in A$, as monotone if, when $V$ is
changing in one direction along some monotone trajectory into $A$, then $Y$ is also changing in one direction
along a monotonic trajectory into output space. (6)

f. Let’s consider a function $F(X,H): V \to Q$ with any fixed $X$ and $H$, and $V \in L$, as monotone if, when $V$ is
changing in one direction along some monotone trajectory into $L$, then $Q$ is also changing in one direction
along a monotone trajectory into controlled variable space. (7)


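g. Let’s consider a function $F(V,H): X \to Y$ as separable if $F: (x^1, x^2, \ldots, x^i+\Delta^i, \ldots, x^m) \to (Y+\kappa^i)$ and
$F: (x^1, x^2, \ldots, x^j+\Delta^j, \ldots, x^m) \to (Y+\kappa^j)$, then $F: (x^1, x^2, \ldots, x^i+\Delta^i, \ldots, x^j+\Delta^j, \ldots, x^m) \to (Y+\kappa^i+\kappa^j)$ (8)

h. Let’s consider a function $F(X,H): V \to Y$ as separable if $F: (v^1, v^2, \ldots, v^i+\Delta^i, \ldots, v^p) \to (Y+\kappa^i)$ and
$F: (v^1, v^2, \ldots, v^j+\Delta^j, \ldots, v^p) \to (Y+\kappa^j)$, then $F: (v^1, v^2, \ldots, v^i+\Delta^i, \ldots, v^j+\Delta^j, \ldots, v^p) \to (Y+\kappa^i+\kappa^j)$ (9)

i. Let’s consider a function $F(X,H): V \to Q$ as separable if $F: (v^1, v^2, \ldots, v^i+\Delta^i, \ldots, v^p) \to (Q+\varphi^i)$ and
$F: (v^1, v^2, \ldots, v^j+\Delta^j, \ldots, v^p) \to (Q+\varphi^j)$, then $F: (v^1, v^2, \ldots, v^i+\Delta^i, \ldots, v^j+\Delta^j, \ldots, v^p) \to (Q+\varphi^i+\varphi^j)$ (10)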
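Separability in the sense of conditions (8)-(10) can be probed numerically during cause-and-effect trials. The sketch below tests one point and one pair of variables only, so a `True` result is evidence rather than proof; the helper names and the two toy mappings in the usage example are illustrative assumptions.

```python
def perturb(x, deltas):
    """Return a copy of vector x with deltas[index] added to chosen entries."""
    y = list(x)
    for index, delta in deltas.items():
        y[index] += delta
    return y


def is_separable_at(f, x, i, j, delta_i, delta_j, tol=1e-9):
    """Check the separability condition at point x for variables i and j.

    f maps an input (or control) vector to a list of outputs. The effect of
    changing both variables must equal the sum of the individual effects.
    """
    base = f(x)
    effect_i = [a - b for a, b in zip(f(perturb(x, {i: delta_i})), base)]
    effect_j = [a - b for a, b in zip(f(perturb(x, {j: delta_j})), base)]
    effect_ij = [a - b for a, b in
                 zip(f(perturb(x, {i: delta_i, j: delta_j})), base)]
    return all(abs(c - (a + b)) <= tol
               for a, b, c in zip(effect_i, effect_j, effect_ij))


if __name__ == "__main__":
    additive = lambda x: [2.0 * x[0] + x[1] ** 2]   # no interaction term
    product = lambda x: [x[0] * x[1]]               # variables interact
    print(is_separable_at(additive, [1.0, 2.0], 0, 1, 0.5, 0.25))
    print(is_separable_at(product, [1.0, 2.0], 0, 1, 0.5, 0.25))
```

Note that the additive mapping is non-linear in its second variable yet still passes, since separability only excludes interaction between variables.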


We can define the fluctuation range of the $i$-th ($i \in M$) input variable by an ordered set $A^i$ that consists of real numbers
representing possible measured values of this variable. The whole input space is:

$$A = A^1 \times A^2 \times \ldots \times A^m \tag{11}$$

We can define the reference (desired) value $y^i_r$ for every output variable $i$ ($i \in N$) and the reference output vector
$Y_r = (y^1_r, y^2_r, \ldots, y^n_r)$ for the whole object.

We also can define the permissible output space:

$$\Phi = [y^1_r + \sigma^1, y^1_r - \sigma^1] \times [y^2_r + \sigma^2, y^2_r - \sigma^2] \times \ldots \times [y^n_r + \sigma^n, y^n_r - \sigma^n] \tag{12}$$

where $\sigma^i$ is the accuracy of tracking variable $i$; $[y^i_r + \sigma^i, y^i_r - \sigma^i]$ is the permissible interval of output variable $i$.




Each permissible interval also can be represented by a normalized fuzzy set $U^i$ with a membership function
$b_{U^i}(y)$, $y \in \mathrm{supp}\,U^i$. In this set, $y^i = y^i_r$ has the maximal possible grade $b_{U^i}(y^i_r) = 1$. In this case, the permissible output
space $\Theta$ can be defined as:

$$\Theta = \mathrm{supp}\,U^1 \times \mathrm{supp}\,U^2 \times \ldots \times \mathrm{supp}\,U^n \tag{13}$$

The actual output vector $Y_t$ at moment $t$ usually differs from $Y_r$. This difference is the result of either $X_t$ drift or
mapping function changes caused by $V$ instability and environmental and $H$ changes. Deviation of $Y_t$ from $Y_r$ is a control
error for which the management system must compensate. Compensation can be done either by varying only $V$ or
only $H$ or by simultaneous changes of $V$ and $H$.

The quality of control is evaluated either by an output error vector that at moment $t$ is:

$$E_t = (e^1_t, e^2_t, \ldots, e^n_t), \quad \text{where } e^i_t = y^i_t - y^i_r,\ (i \in N) \tag{14}$$

or by

$$S = \sum_{i \in N} \left(1 - b_{U^i}(y^i_t)\right) \tag{15}$$

that can be used for estimating the degree to which $Y_t \in \Theta$.

We will consider that the controlled object is working within the required accuracy if either of the following conditions
is satisfied:

for $\forall i$ ($i \in N$): $(y^i_r + \sigma^i) \ge y^i_t \ge (y^i_r - \sigma^i)$ or $y^i_t \in \mathrm{supp}\,U^i$ (16)

or $Y_t \in \Phi$ or $Y_t \in \Theta$ (17)
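The crisp and fuzzy accuracy measures can be compared on a small example. The sketch below is hedged: the paper only requires a normalized membership function with grade 1 at the reference value, so the triangular shape used here, as well as all function names, are assumptions made for illustration.

```python
def error_vector(y, y_ref):
    """Output error vector: e_i = y_i - y_ref_i (equation (14))."""
    return [a - r for a, r in zip(y, y_ref)]


def triangular_grade(y, y_ref, sigma):
    """Assumed membership function: 1 at y_ref, falling linearly to 0
    at a distance sigma from y_ref."""
    return max(0.0, 1.0 - abs(y - y_ref) / sigma)


def fuzzy_deviation(y, y_ref, sigma):
    """S = sum_i (1 - grade_i) (equation (15)); 0 means every output sits
    exactly on its reference value."""
    return sum(1.0 - triangular_grade(a, r, s)
               for a, r, s in zip(y, y_ref, sigma))


def within_accuracy(y, y_ref, sigma):
    """Crisp condition (16): every output lies in its permissible interval."""
    return all(r - s <= a <= r + s for a, r, s in zip(y, y_ref, sigma))


if __name__ == "__main__":
    y_ref, sigma = [10.0, 5.0], [1.0, 1.0]
    for y in ([10.0, 5.5], [12.0, 5.0]):
        print(error_vector(y, y_ref), fuzzy_deviation(y, y_ref, sigma),
              within_accuracy(y, y_ref, sigma))
```

The crisp test only answers yes or no, whereas the fuzzy deviation also indicates how close the outputs are to the border of the permissible space.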

All $X_t$ for which conditions (16, 17) are satisfied for some combination of $VH$ are permissible input vectors for this
combination. We can propose two definitions of permissible input subspaces $\Xi^z$ for the $z$-th combination of $VH$:

$$\Xi^z = \{ X \mid X \to Y \in \Phi \ \text{or}\ Y \in \Theta \ \text{for the } z\text{-th combination of } VH \} \tag{18}$$

where [...] is a permissible input interval of input variable $i$ which provides conditions (16, 17) for the $z$-th
combination; [...] are correspondingly the maximal and minimal permissible values of input variable $i$.

We also can define each permissible input interval $i$ as a normalized fuzzy set $W^{iz}$ with a membership function
$b_{W^{iz}}(x)$, $x \in \mathrm{supp}\,W^{iz}$. The corresponding permissible input space $\Psi^z$ is:

$$\Psi^z = \mathrm{supp}\,W^{1z} \times \mathrm{supp}\,W^{2z} \times \ldots \times \mathrm{supp}\,W^{mz} \tag{19}$$

We can define for each output variable $i$ at any time $t$ a distance of $y^i_t$ from the border of a permissible output interval
either via:

$$\delta^i_t = \min\left(|y^i_r + \sigma^i - y^i_t|,\ |y^i_r - \sigma^i - y^i_t|\right) \quad \text{or via membership grades as } 1 - b_{U^i}(y^i_t) \tag{20}$$

For the output vector we can use either

$$D_t = \sqrt{\sum_{i \in N} (\delta^i_t)^2} \quad \text{or (15).} \tag{21}$$

The efficiency with which the management system executes its controlling functions can be different for different
combinations of $V$ and $H$. We can introduce a multivalued goal function $G$ that can be used for evaluating the
efficiency of the management system and optimizing the object:

$$G_t = \sum_{f \in B} l^f q^f_t \tag{22}$$

where $l^f$ is a weight coefficient of controlled variable $f$, $f \in B$.





3 Management 

Management systems should be able to work in two interrelated modes:

1. Simple control mode. This is either a process of minimizing an output error vector

$$\min_{V,H} E_t = (e^1_t, e^2_t, \ldots, e^n_t) \quad \text{or} \quad \min_{V,H} S = \sum_{i \in N} \left(1 - b_{U^i}(y^i_t)\right)$$

or a process of confining $E_t$ to the permissible output error space (12, 13), which is executed by varying either
only $V$ or $V$ together with $H$. For successful control, conditions (2) and (3) must be satisfied. Controlling
algorithms can be relatively simple if conditions (5, 6, 8, 9) are also satisfied.

2. Optimized control mode. This is also a process of the object control corresponding with (24). However, here
the object performance is optimized by varying either only $V$, or $V$ and $H$:

$$\max_{V,H} G \quad \text{upon satisfaction of (16, 17).}$$

Relatively simple algorithms can be used for this optimization if conditions (4, 7, and 10) are satisfied. If conditions
(5-10) are not satisfied, then the algorithm proposed in [3] can be recommended. The optimization (23, 24) that is
executed by varying $V$ is based, in general, on non-linear programming. During such an optimization, conditions (16,
17) can be preserved relatively easily. However, the optimization that is executed via controlled object structural
changes is based on the combinatorial approach. During combinatorial optimization, conditions (16, 17) can be
unexpectedly violated, since any changes of $H$ create a significant destabilization effect on the object. To avoid the
possibility of violations of (16, 17), the management system should, before changes of $H$ are made, try to bring $Y_t$ close to
the center of $\Phi$ ($\Theta$). This can be done, for example, via thorough $V$ optimization. The occurrence of $Y$ near the
center of $\Phi$ ($\Theta$) is an indication that the object has an excess of stability. As a result, the optimization executed via
changes of $H$ becomes possible.


3.1 Models 


Different management system modes require the use of different models. Namely, the simple control mode requires
the input-control variables-object structure-output mapping model. This model reflects $F(V,H): X \to Y$ (Figure 1a)
and $F(X,H): V \to Y$ (Figure 1b). For the optimized control mode, a control variables-controlled variables-mapping
model that reflects $F(X,H): V \to Q$ (Figure 1c) should also be used. Both models are [...] the
self-learning process.




Figure 1 


3.2 Control Approaches 


For object stabilization, two approaches based either on $V_t H_t = f(Y_t)$ or on $V_t H_t = f(X_t)$ can be used. The first
approach uses a feedback scheme, i.e., the management system constantly monitors conditions (16, 17) and corrects
output, if necessary, by varying $V$ and $H$. This approach is relatively accurate. However, it is slow, since control
decisions are delayed by the object $X \to Y$ transformation time and by a decision process that requires CPU time. The
second, $VH = f(X_t)$, approach does not use a feedback scheme all the time. Instead, the management system
continuously monitors condition $X_t \in \Xi^z$ ($X_t \in \Psi^z$) and makes decisions, either that the $VH$ combination has to be
changed and thus what has to be done to satisfy (16, 17), or that no change has to be made. In other words, the second
approach uses rules of this kind: “If $X_t \in \Xi^z$ ($X_t \in \Psi^z$), then do nothing. If $X_t \notin \Xi^z$ ($X_t \notin \Psi^z$), then find another $\Xi^r$ ($\Psi^r$)
to which $X_t$ belongs and change the object in correspondence with $VH^r$”. This approach is faster than the feedback
approach, but requires that $VH = f(X)$ (reactions) for $\forall X$ be prepared in advance. To decrease reaction time, we
will study the possibility of using the second (without feedback) controlling approach in combination with self-learning
and adaptation.
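The rule scheme of the second approach amounts to a table lookup on the measured input vector. A minimal sketch follows, assuming for simplicity that each permissible input subspace is an axis-aligned box (the paper does not restrict the shape); the `Rule` class and `react` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """One prepared reaction: a VH combination and its permissible inputs."""
    name: str                 # label of the VH combination
    lower: Sequence[float]    # per-variable lower bounds of the subspace
    upper: Sequence[float]    # per-variable upper bounds of the subspace

    def covers(self, x: Sequence[float]) -> bool:
        return all(lo <= xi <= hi
                   for lo, xi, hi in zip(self.lower, x, self.upper))


def react(x: Sequence[float], current: Rule,
          rules: Sequence[Rule]) -> Optional[Rule]:
    """Return the rule to apply for input x without consulting the outputs.

    If x is still inside the current subspace, nothing changes. Otherwise the
    first prepared rule whose subspace contains x is selected. None means no
    prepared reaction exists and self-learning would be needed.
    """
    if current.covers(x):
        return current
    for rule in rules:
        if rule.covers(x):
            return rule
    return None


if __name__ == "__main__":
    rule_a = Rule("A", (0.0, 0.0), (5.0, 5.0))
    rule_b = Rule("B", (4.0, 0.0), (10.0, 5.0))
    for x in ((3.0, 1.0), (7.0, 1.0), (20.0, 1.0)):
        chosen = react(x, rule_a, [rule_a, rule_b])
        print(x, chosen.name if chosen else None)
```

Because the lookup needs only the measured input, its cost is independent of the object's $X \to Y$ transformation time, which is the source of the speed advantage over the feedback scheme.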

4 Self-Learning and Adaptation 

The purpose of self-learning is the development (modification) of control rules based on cause-and-effect information
received via trials. In the absence of analytical models, it is recommended that trials be made on the real objects.
Self-learning consists of three phases. Namely:

1. The preliminary cause-and-effect trials phase. This phase is dedicated to the analysis of how linear $F(V,H): X \to Y$,
$F(X,H): V \to Y$, and $F(X,H): V \to Q$ are, and where conditions (2-7) are satisfied. Conditions (8-10)
must also be studied.

For the study of $F(V,H): X \to Y$ and for the fixed $V$ and $H$, we will either observe natural $X$ fluctuations on the real
object, or actively change $X$ on the model or on the real object. For each $X$, a value of $Y$ is defined. This process is
repeated for different $V$ and $H$. Similarly, for the study of $F(X,H): V \to Y$, and for the fixed $X$ (if it is possible)
and $H$, we will vary $V$ and define for each $V$ a value of $Y$. This process is repeated for different $X$ and $H$. The
purpose of this is to check (2, 3, 5, 6, 8, 9).

For the study of $F(X,H): V \to Q$, and for the fixed $X$ (if it is possible) and $H$, we will vary $V$ and define for each
$V$ a value of $Q$. This process is repeated for different $X$ and $H$. The purpose of this is to check (4, 7, 10).

The number of such trials is dictated by the desired accuracy of verification of (2 -10) and it must be held to the 
minimum. The results of the first phase are needed for the selection of optimization algorithms. 

2. The development of the control rules phase. This is implemented via a two-step object optimization. During the
first step, a $VH$ combination (feasible solution) is received for the analyzed $X$ in correspondence with (24) (Figure
2). The second step is the selection of the optimal (in correspondence with (25)) $VH$ combination for the same $X$.
(The use of optimization for developing rules allows us to restrict the number of analyzed $VH$ combinations only to
the combinations used during optimization. As a result, the analysis of all possible combinations of $VH$, required



Control rule development starts from some $X$. When the optimal $VH$ combination is received for this $X$, then a
permissible input space $\Xi^z$ or $\Psi^z$ for this combination is defined on the basis of (18) or (19) by varying either model or
real object inputs (Figure 3).



Figure 3 




Consequently, we will formulate a rule: “If $X \in \Xi^z$ ($X \in \Psi^z$), then the $H^z V^z$ combination has to be chosen”. Then the
two-stage optimization process is repeated for some other $X \notin \Xi^z$ ($X \notin \Psi^z$). As a result, we will find other $V^r H^r$
combinations and other input subspaces $\Xi^r$ or $\Psi^r$ (Figure 4).


Figure 4 (labels: $\Xi^z$ ($\Psi^z$); $\Phi$ ($\Theta$))


After this, the next $X$ (neither $X \in \Xi^z$ ($X \in \Psi^z$) nor $X \in \Xi^r$ ($X \in \Psi^r$)) is selected and the process is repeated. Input
vectors are selected for the analysis until either the whole input space $A$ becomes decomposed or a feasible solution
for some $X$ is impossible to find. As a result of decomposition, some input subspaces can intersect; i.e., more
than one feasible solution exists for some $X$; i.e., $X \in \Psi^{zr}$, $\Psi^{zr} = \Psi^z \cap \Psi^r \neq \emptyset$ or $X \in \Xi^{zr}$, $\Xi^{zr} = \Xi^z \cap \Xi^r \neq \emptyset$, where $\Xi^{zr}$
or $\Psi^{zr}$ represent the $zr$-th intersection.


If $X_t \in \Xi^z$ and $X_t \in \Xi^r$, and if conditions (3) and (6), or (9) are satisfied, then a combined weighted rule [4] can be
executed to provide $Y_t \in \Theta$.


When the controlled object is non-linear, it is possible that $|\Xi^z| \neq |\Xi^r|$ ($|\Psi^z| \neq |\Psi^r|$). We should try to avoid
intersection situations because the more input space that belongs to the intersections, the more $VH$ combinations have to be
analyzed. If $\Xi^z$ ($\Psi^z$) represents an object stability input space for the $z$-th combination of $VH$, then $\Xi = \bigcup_z \Xi^z$
($\Psi = \bigcup_z \Psi^z$) represents a total object stability input space.

When $\Xi = A$ ($\Psi = A$), then the object is stable. If $\Xi \subset A$ ($\Psi \subset A$), then the object is only partially stable. Results
consisting of optimal input subspace-object structure-control variable vector rules should be stored in the
input-reaction table.
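Whether a set of rules decomposes the whole input space, and where subspaces intersect, can be checked on a discretized input space. The following sketch assumes a finite grid of input vectors and membership predicates for each subspace; `analyze_decomposition` and the example predicates are illustrative, not part of the paper.

```python
from itertools import combinations


def analyze_decomposition(grid, subspaces):
    """Summarize how permissible input subspaces cover a discretized space.

    grid      : iterable of hashable input vectors (e.g. tuples)
    subspaces : dict mapping a VH-combination name to a predicate that tells
                whether an input vector belongs to its permissible subspace
    Returns (covered points, pairwise intersections, stable flag), where the
    object counts as stable if the union covers every grid point.
    """
    points = list(grid)
    members = {name: {x for x in points if inside(x)}
               for name, inside in subspaces.items()}
    covered = set().union(*members.values())
    overlaps = {}
    for a, b in combinations(members, 2):
        common = members[a] & members[b]
        if common:
            overlaps[(a, b)] = common
    return covered, overlaps, covered == set(points)


if __name__ == "__main__":
    grid = [(x1, x2) for x1 in range(4) for x2 in range(4)]
    subspaces = {"z": lambda x: x[0] <= 2, "r": lambda x: x[0] >= 2}
    covered, overlaps, stable = analyze_decomposition(grid, subspaces)
    print(len(covered), sorted(overlaps[("z", "r")]), stable)
```

Points that appear in an intersection are the ones for which more than one feasible $VH$ combination exists, so they indicate where additional analysis would be required.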

3. The adaptation phase. Because an object and/or object environment usually are non-stationary, the management
system’s ability to control and its performance efficiency degrade with time. To maintain the controlled object
performance at the predefined level, the management system should constantly monitor the validity of the developed
rules and the values of $G_t$. The purpose of monitoring is to detect the moment when management becomes
inefficient. When management system inefficiency is detected, another self-learning process is necessary.

Depending on the peculiarities of controlled objects, we will use different approaches in the self-learning. 

Some controlled objects or their models can be studied in the test-bed mode. In this mode $X_t$ is stabilized as $X$ and
precisely measured. Other vectors ($Y$, $V$, $Q$) can be precisely measured also. The test-bed mode permits
implementation of a special algorithm during the first and second phases of self-learning. Moreover, the self-learning process can be
executed automatically. Algorithms for the (23, 24) optimization are selected depending on the results of the
cause-and-effect phase. If the test-bed mode is unacceptable, then self-learning has to be implemented on the real object in
the on-line mode. In this case, conditions (2 and 5) are analyzed within the first phase. This analysis is made via either
natural or specially created changes of input variables upon fixed $VH$. If the algorithm [3] is used for the (23, 24)
optimization, then the analysis of conditions (3 and 4) can be made during the second phase. Moreover, algorithm [3]
does not require satisfaction of conditions (6 - 10).


References 

[1] L. A. Zadeh, “The Concept of a Linguistic Variable and its Application to Approximate Reasoning.” Information 
Sciences, 8 pp. 199 - 249, 301-357 and 9 pp. 43-80. 

[2] M. Sugeno, “An Introductory Survey of Fuzzy Control,” Information Sciences (1985) 59-83.

[3] Z. Flikop, “Routing Optimization in Packet Switching Communication Networks,” European Journal of Operational
Research 19 (1985) 262-267.

[4] R. R. Yager and D. P. Filev, "Identification of Nonlinear Systems by Fuzzy Models,” (forthcoming). 






[5] A. S. Tanenbaum, Computer Networks, Second edition, Prentice Hall, New Jersey, 1988.

Appendix 

This appendix presents an example of the application of the methodology proposed in this paper. Let’s consider the
packet switching network (Figure 5) analyzed in [5, pp. 304-305]. The network structure $H$ is defined by a set of
nodes, their connecting lines, and the line capacities.



The numbers displayed in Figure 5 represent line capacities $C^i$ (where $i$ is a line number, $i = 1, \ldots, 16$) in
packets-per-second. In this network, traffic transmitted between any source and destination nodes can be split between
different paths.


In the described network: 

$$F(V, H): \Lambda_{inp} \to \Lambda_{out}$$

where $V$ is the control vector that is implemented in the network via node routing
tables and defines sets of paths for every source-destination pair of nodes. It also defines in what proportion traffic
must be split between paths. For example, $v^{DCBA}$ is the portion of the traffic transmitted from source node D to
destination node A via path DCBA. (Control vectors are presented in Table 1, columns “Curve A”, “Curve B”, and “Curve
C”.)

$\Lambda_{inp} = (\Lambda^{AB}, \Lambda^{AC}, \ldots, \Lambda^{BC}, \ldots, \Lambda^{EA})$ is an input vector, where $\Lambda^{AB}$, for example, is traffic that is entering the
network via node A and destined for node B. When traffic between nodes A and B is split, then $\Lambda^{AB} = \sum_{j \in J^{AB}} \lambda^j$; $J^{AB}$
is a set of paths between which traffic from A to B is split; $\lambda^j$ is the traffic in path $j$.

One can see examples of splitting in Table 1, “Path” column.

$\Lambda_{out} = (\Lambda_{out}^{AB}, \Lambda_{out}^{AC}, \ldots, \Lambda_{out}^{BC}, \ldots, \Lambda_{out}^{EA})$ is an output vector. $\Lambda_{out}^{AB}$ is, for example, traffic that is entering the
network via node A, is transmitted by the network to node B, and is successfully leaving the network via node B. (When
traffic approaches link capacity, then the network can drop traffic to avoid congestion. In this case $\Lambda_{out} < \Lambda_{inp}$.)


$$F(H, \Lambda_{inp}): V \to \lambda$$

is a function that defines the mapping of a control vector into a controlled vector $\lambda = (\lambda^1, \lambda^2, \ldots, \lambda^{16})$
that represents the actual (in pkts/sec) traffic in every link of the network.

$$\text{Here } \lambda^i = \sum_{k^i \in K^i} \lambda^{k^i}, \tag{25}$$

where $\lambda^{k^i}$ is traffic in the $k^i$ path ($k^i \in K^i$); $K^i$ is a set of paths that contain link $i$. All three parameters ($\lambda^{k^i}$, $k^i$, and $K^i$)
are governed by $V$.
We evaluate the control $\Phi$ of the network via the function:

$$\Phi = \sum_{i=1}^{16} \frac{\lambda^i}{(C^i - \lambda^i)} \tag{26}$$

For simplicity, let’s evaluate an efficiency $G$ of the network according to the following:

$$G = \sum_{i=1}^{16} C^i \tag{27}$$




For illustration, let’s define the network task as one that provides a control upon $\Phi \le 14.0$ and min $G$ when traffic
between nodes B-F and C-E varies and is fixed for all other source-destination pairs of nodes.

When $V$ provides $C^i > \lambda^i$ for $\forall i$, then mapping $F(V,H): \Lambda_{inp} \to \Lambda_{out}$ is continuous, separable, and monotonic,
and $\Lambda_{out} = \Lambda_{inp}$. If a routing optimization algorithm allows for the continuous changing of $V$, then mapping
$F(\Lambda_{inp}, H): V \to \lambda$ is also continuous. However, it is non-separable because traffic in every link is defined via
(25) and (26) is a non-linear function.

For developing the network control rules, we use the model and optimization algorithm proposed in [3]. This model
and optimization algorithm allow for the continuous changing of $V$ and they do not require [...]
optimized function. Development of the rules can be done in the test-bed mode. Since we have already checked the
continuity, separability, and monotonicity of the $F(V,H): \Lambda_{inp} \to \Lambda_{out}$ and $F(\Lambda_{inp},H): V \to \lambda$ mappings, let’s
go directly to the second phase of the control rules development.

We can start the development with traffic values for the B-F and C-E source-destination pairs which are presented in
[5]; namely 4.0 pkts/sec for B-F and 3.0 pkts/sec for C-E. During the first step of the optimization (which is based
on [...] network structure), the procedure described in [3] checks [...]
vector providing conditions $\Phi \le 14.0$ can be found. Such a control vector A was found and is stored in Table 1 in
the column “Curve A traffic (%)”. The value of $\Phi$ that corresponds to this vector and the analyzed network structure
and traffic values is 13.56 and $G = 425$. Since $\Phi < 14.0$, the network has an excess of capacity and its structure can
be optimized.

During the second step of optimization, the network structure (link capacities) was changing;
$V$ was fixed. As a result of this optimization, the capacity of the EF and FE lines was decreased to 42.63 pkts/sec.
That corresponds to $G = 385.26$ and $\Phi = 14.0$. Then permissible input spaces were defined. This was done on a fixed
network structure by varying B-F and C-E traffic, while monitoring $\Phi \le 14.0$ conditions. We started with the
original [...] and the vector that is optimal for B-F traffic that is equal to 4.0 pkts/sec and C-E traffic that is
equal to 3.0 pkts/sec. Modeling allows us to plot curve “A” on Figure 6. The zone under the curve is a permissible
input space and it defines possible combinations of B-F and C-E traffic for which the analyzed control vector
provides $\Phi \le 14.0$. In other words, a control rule “Until B-F and C-E traffic is in the zone A, the set of
routing tables corresponding to “Curve A traffic (%)” of the Table 1 should be used” can be applied.

We can see from curve "A" that the maximal B-F traffic is limited to 9.66 pkts/sec and C-E traffic is limited to
8.12 pkts/sec. To analyze the network's ability to absorb more B-F traffic, we can choose a value of B-F traffic
which is above 9.66 pkts/sec. Then we can find another control vector that provides (5) and the condition Φ ≤ 14.0.
This vector is presented in Table 1, column "Curve B traffic (%)". Analysis of the network with the new control vector
allows us to plot curve "B" in Figure 6.

We can repeat similar procedures for C-E traffic that exceeds 8.12 pkts/sec. This gives us one more control vector
(Table 1, column "C") and another curve "C".

As a result of these studies, the following rules can be created: "Until B-F and C-E traffic is in the zone below
curve A, the set of routing tables corresponding to "Curve A traffic (%)" should be used. If traffic is in zone I, then
use the set of routing tables corresponding to "Curve B traffic (%)". If traffic is in zone II, then use the set of routing
tables corresponding to "Curve C traffic (%)"." These rules apply to the initial H that corresponds to G = 425.

A similar study can be executed with control vector "A" and the network in which the capacity of lines EF and FE
was decreased to 42.63 pkts/sec. As a result a curve A' was plotted. The previous rule can be modified by adding
the following: "If B-F and C-E traffic is in the zone below the curve A', then the routing tables corresponding to
"Curve A traffic (%)" should be used and the capacity of lines EF and FE can be decreased to 42.63 pkts/sec."





Table 1: Network Traffic Routing

[Table body illegible in the source. Legible column headings: Destination, Traffic (pkts/sec), Path, Curve A traffic (%), Curve B traffic (%).]
































N93-29560

A Neural Fuzzy Controller Learning by Fuzzy Error Propagation 


Detlef Nauck Rudolf Kruse 

Department of Computer Science 
Technical University of Braunschweig 
W-3300 Braunschweig, Germany 


Keywords: fuzzy control, fuzzy error, fuzzy error propagation, 
membership function, learning algorithm, neural network 


Abstract 

In this paper we describe a procedure to integrate techniques for the adaptation of 
membership functions in a linguistic variable based fuzzy control environment by using 
neural network learning principles. This is an extension to our work in [2]. 

We solve this problem by defining a fuzzy error that is propagated back through the
architecture of our fuzzy controller. According to this fuzzy error and the strength of 
its antecedent each fuzzy rule determines its amount of error. Depending on the current 
state of the controlled system and the control action derived from the conclusion, each rule 
tunes the membership functions of its antecedent and its conclusion. By this we get an 
unsupervised learning technique that enables a fuzzy controller to adapt to a control task 
by knowing just about the global state and the fuzzy error. 


1 Introduction 

One of the design problems of a fuzzy controller is the choice of appropriate membership 
functions or the tuning of a priori membership functions in order to improve the performance 
of the fuzzy controller. 

We solve this problem by defining a fuzzy error that is propagated back through the
neural-like architecture of our fuzzy controller. According to this fuzzy error, the strength 
of its antecedent, the current state of the controlled system, and the control action derived 
from the conclusion, each fuzzy rule determines its amount of error and tunes the membership 
functions of its antecedent and its conclusion. This paper is an extension to our work in [2], 
where we proposed a supervised learning algorithm depending on a non-fuzzy error. 

We refrained from just integrating neural nets in certain parts of the architecture as black
boxes, as is done in other approaches, and from adding an extra module to the architecture
that takes care of the correction of errors, for example by weighting the rules according to the errors
as described in [4, 12].

Our main concern is to keep the structure of the fuzzy controller that is determined by the
fuzzy rules. We think of those rules as a piece of structural knowledge that gives us a roughly 
correct representation of the system to be controlled. If the actual output of the controller 
differs from the desired behaviour, we consider an unsuitable choice of membership functions 
that model the linguistic values of the system variables to be responsible [8]. 

We understand the adaptations of the membership functions as a reverse mechanism de- 
duced from the forwarding inference machinery. We consider the computation of the control 
value from given measured input values as a feedforward procedure like in layered neural nets
[10], where the inputs are forwarded through the net resulting in some output values. If the
actual output is not able to drive the controlled system to a desired state, an error has to be 
propagated back through the architecture changing parameters taking into account the feed 
forward propagation of inputs. 

Because it is usually not possible to determine an optimal control action for a given state, 
we are not able to calculate the error of the produced output directly. This means we cannot 
use a supervised learning procedure like standard backpropagation. But by evaluating the state 
of the controlled system, we are able to determine a global error measure. This enables us to 
define a non-supervised learning algorithm. Training a fuzzy controller with such a learning 
procedure allows us to keep track of the changes and to interpret the modified rules. 

The term “non-supervised” indicates in this context, that there is no “teacher” providing 
a desired output value to be compared to the actual output value. The controller is able 
to calculate the fuzzy error by just knowing about the state of the plant. From another 
point of view one could say, that the system is watched by a supervisor who uses “good” and 
“bad” signals to guide the learning procedure. But this kind of reinforcement learning is not 
considered to be a plain supervised procedure, and so we prefer to call our learning algorithm 
non-supervised, although it is derived from the BP-algorithm for neural networks [10]. 

Considering the ideas on which fuzzy controllers are based, we think it is a natural approach
to use a fuzzy error for our system, which, according to its structure, we may call neural fuzzy 
controller. 

In the following sections we first present the structure of our controller. Then we describe 
the fuzzy error propagation algorithm that we use as our learning procedure. Next we consider 
some simulation results concerning the control of an inverted pendulum and in the last section 
we discuss our results. 


2 The Neural Fuzzy Controller 

We consider a dynamical system S that can be controlled by one variable C and whose state can
be described by n variables $X_1, \ldots, X_n$, i.e. we have a multiple input - single output system.
For each of the mentioned variables we consider measurements in a subinterval $H = [h_1, h_2]$
of the real line. The imprecision is modelled by mappings $\mu : [h_1, h_2] \to [0, 1]$ in the sense of
membership functions with the obvious interpretation as representations of linguistic values.

The control action that drives the system S to a desired state is described by the well-known
concept of fuzzy if-then rules [13], where a conjunction of input variables associated with their 
respective linguistic values determine a linguistic value associated with the output variable. 
All rules are evaluated in parallel, and their outputs are combined to a fuzzy set which has to 
be defuzzified to receive the crisp output value. The conjunction of the inputs is usually done 
by the min-operation, and for the aggregation of the outputs of the rules the max-operation is 
usually chosen, as it is done by the well-known Zadeh-Mamdani procedure [7, 13]. 

For the evaluation of fuzzy rules the defuzzification operation constitutes a problem that
cannot be neglected. It is not obvious which crisp value is best suited to characterize the output 
fuzzy set of the rule system. In most of the fuzzy control environments the center-of-gravity 
method is used [5, 6]. Using this method, it is difficult to determine the individual part that 
each rule contributes to the final output value. 

Figure 1: Defuzzification using Tsukamoto's monotonic membership functions

To overcome this problem we use Tsukamoto's monotonic membership functions, where the
defuzzification is reduced to an application of the inverse function [1, 6]. Such a membership
function $\mu$ is characterized by two points $a, b$ with $\mu(a) = 0$ and $\mu(b) = 1$, and it is defined as

\[
\mu(x) = \begin{cases} \dfrac{-x + a}{a - b} & \text{if } (x \in [a, b] \wedge a < b) \vee (x \in [b, a] \wedge a > b) \\ 0 & \text{otherwise.} \end{cases}
\]

The defuzzification is carried out by

\[
x = \mu^{-1}(y) = -y(a - b) + a
\]

with $y \in [0, 1]$.
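The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `membership` and `defuzzify` are hypothetical, and the sketch assumes $a \neq b$.

```python
def membership(x: float, a: float, b: float) -> float:
    """Monotonic membership function with mu(a) = 0 and mu(b) = 1 (assumes a != b)."""
    lo, hi = min(a, b), max(a, b)
    if lo <= x <= hi:
        return (-x + a) / (a - b)
    return 0.0  # outside the interval spanned by a and b


def defuzzify(y: float, a: float, b: float) -> float:
    """Inverse of the membership function: x = -y (a - b) + a for y in [0, 1]."""
    if not 0.0 <= y <= 1.0:
        raise ValueError("y must lie in [0, 1]")
    return -y * (a - b) + a
```

Because the function is monotonic on its support, `defuzzify(membership(x, a, b), a, b)` returns `x` for any `x` between `a` and `b`, which is what makes the crisp conclusion of a single rule directly available.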

Consider the following two rules

$R_1$: IF $\theta$ is PM AND $\dot{\theta}$ is PS THEN F is PS,

$R_2$: IF $\theta$ is PS AND $\dot{\theta}$ is PZ THEN F is PZ,

where PM, PS and PZ represent the usual linguistic expressions positive medium, positive small, and positive zero.

The evaluation of them is presented in figure 1. [...] We do not make use
of this possibility in our controller for reasons of simplicity.

The structure of our neural fuzzy controller is depicted in figure 2. The
X-modules represent the input variables that describe the state of the system to be
controlled (plant, for short). These modules deliver their crisp values to their $\mu$-modules, which
contain the membership functions of the linguistic values assigned to the respective
input variables. The $\mu$-modules are connected to the following R-modules, which represent the
fuzzy if-then rules of the controller. Each $\mu$-module gives to its connected
R-modules the membership value of its input variable $X_i$. It is possible for each
$\mu$-module to be connected to more than one R-module. The R-modules use an operator (the min-operation
in this case) to calculate the conjunction of their inputs and pass this value forward to one
of the $\nu$-modules, which contain the membership functions representing the linguistic values
of the output variable. By passing through the $\nu$-modules these values are changed to the
conclusion of the respective rule. This means the implication (min-implication in this case) is
carried out to obtain the value of the conclusion, which is usually a fuzzy set in a more general
case. The conclusions are then passed on to the C-module where they are aggregated (e.g. by
max-operation), and a crisp control value is determined by a defuzzification procedure.

Figure 2: The structure of the neural fuzzy controller

In our case, however, monotonic membership functions are used, and so the $\nu$-modules pass
pairs $(r_i, \nu^{-1}(r_i))$ to the C-module, where the final output value is calculated by

\[
c = \frac{\sum_{i=1}^{n} r_i \, \nu^{-1}(r_i)}{\sum_{i=1}^{n} r_i},
\]

where $n$ is the number of rules, and $r_i$ is the degree to which rule $R_i$ has fired.

From a more general point of view one can interpret the messages from the $\nu$-modules to the
C-module as fuzzy sets clipped by the min-implication at height $r_i$. The C-module aggregates
the conclusions by a max-operation, and uses a non-standard defuzzification procedure as
mentioned above.
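The forward pass described in this section can be sketched as follows. The sketch is only an illustration under stated assumptions: the names `Pair`, `mu`, `mu_inverse` and `control_output` are hypothetical, every membership function is given as a pair $(a, b)$ with $a \neq b$, and returning 0.0 when no rule fires is a choice made here, not something the text specifies.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[float, float]  # (a, b) with mu(a) = 0 and mu(b) = 1, a != b


def mu(x: float, ab: Pair) -> float:
    """Monotonic membership value of x."""
    a, b = ab
    return (a - x) / (a - b) if min(a, b) <= x <= max(a, b) else 0.0


def mu_inverse(y: float, ab: Pair) -> float:
    """Crisp value whose membership degree is y."""
    a, b = ab
    return -y * (a - b) + a


def control_output(inputs: Sequence[float],
                   rules: List[Tuple[Sequence[Pair], Pair]]) -> float:
    """Weighted average of the crisp rule conclusions.

    Each rule is (antecedent pairs, conclusion pair); the antecedents are
    combined by the min-operation, as in the R-modules.
    """
    numerator = 0.0
    denominator = 0.0
    for antecedents, conclusion in rules:
        r = min(mu(x, ab) for x, ab in zip(inputs, antecedents))  # firing degree r_i
        c = mu_inverse(r, conclusion)                             # crisp conclusion
        numerator += r * c
        denominator += r
    # Assumption of this sketch: output 0.0 if no rule has fired.
    return numerator / denominator if denominator > 0.0 else 0.0
```

With two hypothetical rules, `[([(0, 10), (0, 4)], (0, 20)), ([(10, 0), (4, 0)], (0, -20))]`, and inputs `(5, 1)`, the firing degrees are 0.25 and 0.5, the crisp conclusions are 5 and -10, and the weighted average is -5.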













As one can easily see, the system in figure 2 resembles a feedforward neural network. The
X-, R-, and C-modules can be viewed as the neurons and the $\mu$- and $\nu$-units as the adaptable
weights of the network. The X- and C-layer are identified as input layer and output layer,
respectively, and the R-layer serves as the intermediate or hidden layer that constitutes the
internal representation of the network. The fact that one $\mu$-module can be connected to more
than one R-module is equivalent to connections in a neural network that share a common
weight [9]. This is very important, because we want each linguistic value to be represented by
only one membership function that is valid for all rules.

By this restriction we retain the structural knowledge that we put into the system by 
defining the rules. In other neural fuzzy systems this fact is not recognized [1, 3] and it is 
possible that one linguistic value is represented by different membership functions. 


3 The Fuzzy Error Propagation Algorithm 

Our goal is to tune the membership functions of the controller by a learning algorithm. Because
it is usually not possible to calculate the optimal control action for a given state of the plant,
we cannot derive the error directly by comparing the optimal to the actual value. Instead, we try
to obtain a measure that adequately describes the state of the plant under consideration.

The optimal state of the plant can be described by a vector of state variable values. That 
means, the plant has reached the desired state if all of its state variables have reached their 
value defined by this vector. But usually we are content with the current state if the variables 
have roughly taken these values. And so it is natural to define the goodness of the current 
state by a membership function from which we can derive a fuzzy error that characterizes the
performance of our neural fuzzy controller. 

Consider a system with n state variables $X_1, \ldots, X_n$. We define the fuzzy-goodness $G_1$ as

\[
G_1 = \min\left(\mu_{X_1}^{optimal}(x_1), \ldots, \mu_{X_n}^{optimal}(x_n)\right),
\]

where the membership functions $\mu_{X_i}^{optimal}$ have to be defined according to the requirements of
the plant under consideration.

In addition to a near optimal state we also consider states as good where the incorrect
values of the state variables compensate each other in such a way that the plant is driven towards
its optimal state. We define the fuzzy-goodness $G_2$ as

\[
G_2 = \min\left(\mu_1^{compensate}(x_1, \ldots, x_n), \ldots, \mu_k^{compensate}(x_1, \ldots, x_n)\right),
\]

where the membership functions $\mu^{compensate}$ again have to be defined according to the
requirements of the plant. There may be more than one $\mu^{compensate}$, and they may depend on two or
more of the state variables.

The overall fuzzy-goodness is defined as

\[
G = g(G_1, G_2),
\]

where the operation g has to be specified according to the actual application. In some cases a
min-operation may be appropriate, and in other cases it may be more adequate to choose just
one of the two goodness measures, perhaps depending on the sign of the current values of the
state variables, e.g. we may want to use $G_1$ if all variables are positive or negative and $G_2$ if
they are both positive and negative.

The fuzzy-error that is made by our neural fuzzy controller is defined as 


\[
E = 1 - G.
\]
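The three definitions can be combined in a short sketch. Everything concrete here is an assumption for illustration: the function names are hypothetical, the membership functions are passed in as plain Python callables, and the operation g defaults to the min-operation, which the text names only as one possible choice.

```python
from typing import Callable, Sequence

StateMF = Callable[[float], float]                 # membership function of one state variable
CompensateMF = Callable[[Sequence[float]], float]  # membership function of the whole state


def goodness_optimal(state: Sequence[float], optimal_mfs: Sequence[StateMF]) -> float:
    """G_1: how close every state variable is to its optimal value."""
    return min(mf(x) for mf, x in zip(optimal_mfs, state))


def goodness_compensate(state: Sequence[float],
                        compensate_mfs: Sequence[CompensateMF]) -> float:
    """G_2: how well incorrect values compensate each other."""
    return min(mf(state) for mf in compensate_mfs)


def fuzzy_error(state: Sequence[float],
                optimal_mfs: Sequence[StateMF],
                compensate_mfs: Sequence[CompensateMF],
                g: Callable[[float, float], float] = min) -> float:
    """E = 1 - G with G = g(G_1, G_2); g = min is an assumption of this sketch."""
    g1 = goodness_optimal(state, optimal_mfs)
    g2 = goodness_compensate(state, compensate_mfs)
    return 1.0 - g(g1, g2)
```

Passing `g=max`, or a function that selects $G_1$ or $G_2$ depending on the signs of the state variables, covers the other cases mentioned in the text without changing the rest of the sketch.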



We are now able to define our learning algorithm that works for each fuzzy rule in parallel.
Each rule $R_i$ knows the value $r_i$ of the conjunction of its antecedents and the value $c_i$ of its
conclusion. Because we are using monotonic membership functions, $c_i$ is already crisp. After the
control action has been determined by the controller and the new state of the plant is known, we
propagate the fuzzy-error E and the current values of the state variables to each R-module. If
the rule has contributed to the control output, i.e. $r_i \neq 0$, it has to evaluate its own conclusion.
According to the current state of the plant the rule can decide whether its conclusion would
drive the system to a better or to a worse state. The optimal control value $c_{opt}$ itself cannot be determined,
but its direction, i.e. $\operatorname{sgn}(c_{opt})$, is known. For the case $\operatorname{sgn}(c_i) = \operatorname{sgn}(c_{opt})$ the rule has to be
made more sensitive and has to produce a conclusion that increases the current control action,
i.e. makes it more positive or negative respectively. For the second case the opposite action
has to be taken.

Consider that we are using Tsukamoto's monotonic membership functions. Each membership
function can be characterized by a pair $(a, b)$ such that $\mu(a) = 0$ and $\mu(b) = 1$ hold. A rule
is made more sensitive by increasing the difference between these two values in each of its
antecedents. That is done by keeping the value of b and changing a. That means the membership
functions keep their positions determined by their b-values, and their ranges determined
by $|a - b|$ are made wider. To make a rule less sensitive the ranges have to be made smaller.
In addition to the changes in its antecedents, each firing rule has to change the membership
function of its conclusion. If a rule has produced a good control value, this value is made better
by decreasing the difference $|a - b|$, and a bad control value is made less bad by increasing
$|a - b|$.

The rules change the membership functions by propagating their own rule-error

\[
e_{R_i} = \begin{cases} -r_i \cdot E & \text{if } \operatorname{sgn}(c_i) = \operatorname{sgn}(c_{opt}) \\ r_i \cdot E & \text{if } \operatorname{sgn}(c_i) \neq \operatorname{sgn}(c_{opt}) \end{cases}
\]

to the connected $\mu$- and $\nu$-modules. The changes in the membership functions of the conclusions
($\nu$-modules) are calculated according to

\[
a_k^{new} = \begin{cases} a_k - \sigma \cdot e_{R_i} \cdot |a_k - b_k| & \text{if } a_k < b_k \\ a_k + \sigma \cdot e_{R_i} \cdot |a_k - b_k| & \text{otherwise,} \end{cases}
\]

where $\sigma$ is a learning factor and R-module $R_i$ is connected through $\nu_k$ to the C-module. If
$\nu_k$ is shared, it is changed by as many R-modules as are connected to the C-module through
this membership function. For the membership functions of the antecedents ($\mu$-modules) the
following calculation is carried out:

\[
a_{jk_j}^{new} = \begin{cases} a_{jk_j} + \sigma \cdot e_{R_i} \cdot |a_{jk_j} - b_{jk_j}| & \text{if } a_{jk_j} < b_{jk_j} \\ a_{jk_j} - \sigma \cdot e_{R_i} \cdot |a_{jk_j} - b_{jk_j}| & \text{otherwise,} \end{cases}
\]

where the X-module $X_j$ is connected to the R-module $R_i$ through the membership function
$\mu_{jk_j}$, with $k_j \in \{1, \ldots, s_j\}$, and $s_j$ is the number of linguistic values of $X_j$. If $\mu_{jk_j}$ is shared, it
is changed by as many R-modules as $X_j$ is connected to through this $\mu$-module.
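The update step of a single rule can be sketched as below. This is an illustrative reading of the rule-error and of the two update formulas, not the authors' code: the function names are hypothetical, the direction of the optimal control value is passed in as `sgn_c_opt` (+1 or -1), and the learning factor used in the usage note is chosen for readability only.

```python
def sgn(v: float) -> int:
    """Sign of v as -1, 0 or +1."""
    return (v > 0) - (v < 0)


def rule_error(r_i: float, fuzzy_error: float, c_i: float, sgn_c_opt: int) -> float:
    """Rule-error: negative if the conclusion points in the right direction."""
    if sgn(c_i) == sgn_c_opt:
        return -r_i * fuzzy_error
    return r_i * fuzzy_error


def update_conclusion(a: float, b: float, e: float, sigma: float) -> float:
    """New a for a conclusion (nu-module); b is kept fixed."""
    width = abs(a - b)
    return a - sigma * e * width if a < b else a + sigma * e * width


def update_antecedent(a: float, b: float, e: float, sigma: float) -> float:
    """New a for an antecedent (mu-module); b is kept fixed."""
    width = abs(a - b)
    return a + sigma * e * width if a < b else a - sigma * e * width
```

For a rule with $r_i = 0.5$, $E = 0.4$ and a conclusion of the right sign, the rule-error is -0.2. With $\sigma = 0.1$ and a membership function $(a, b) = (0, 10)$, the conclusion moves to $a = 0.2$ (its range shrinks) and an antecedent moves to $a = -0.2$ (its range widens), matching the verbal description of a rule being made more sensitive.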

Compared to learning algorithms used in neural networks one can see that the error is
not just passed back through the system, but that it is propagated to the intermediate layer
constituted by the R-modules, where a rather sophisticated evaluation of this error is carried
out, which is not typical for connectionistic systems. There the error signal is treated equally
by each component of the network. In our system the R-modules propagate the error back and
forward to the $\mu$- and $\nu$-modules, respectively, where less complicated calculations lead to a
change of the membership functions, the "fuzzy weights" from a connectionistic point of view.

The neural fuzzy controller does not have to learn from scratch; knowledge in the form of
fuzzy if-then rules can be coded into the system. The learning procedure does not change this
structural knowledge. It tunes the membership functions in an obvious way, and the semantics
of the controller are not blurred by any semantically suspicious factors or weights attached to the
rules.


4 Controlling an Inverted Pendulum 

In this section we present the results of a simulation of the neural fuzzy controller applied to
the inverted pendulum (figure 4). The inverted pendulum is a well-known nonlinear dynamical
system, often used to test fuzzy controllers. The system is described by two state variables that
are the input variables to the controller: the angle $\theta$, measured against the upright position, and
the angle velocity $\dot{\theta}$, i.e. the change of error. The pendulum is
controlled by one control variable that is the control output, the force F applied to the base
[...]. [...] simplified version of [...]:

\[
(m + \sin^2\theta)\,\ddot{\theta} + \tfrac{1}{2}\dot{\theta}^2 \sin(2\theta) - (m + 1)\sin\theta = -F\cos\theta.
\]

The movement of the rod is simulated by a Runge-Kutta procedure with a timestep width of
[...]. [...] linguistic values PL, PM, PS, [...]. [...] with $\mu(a) = 0$ for each membership function can be found in the
following tables. [...] with extreme initial positions of the pendulum, e.g. $\theta = 20$ and $\dot{\theta} = 2$ [...].



Figure 3: Monotonic membership functions modelling the linguistic values of $\theta$




Figure 4: The inverted pendulum 


Figure 5: The rule base of the neural fuzzy controller (rows and columns are labelled with the linguistic values PL, PM, PS, PZ, NZ, NS, NM, NL of $\theta$ and $\dot{\theta}$; the entries are illegible in the source)




twice. 


The results of the simulation can be found in table 1. We have only documented the changes
in the membership functions of $\theta$. For the other two variables similar changes have been observed.

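The fuzzy error E has been defined by

\[
E = \begin{cases}
1 - \min\left(1 - \dfrac{|\theta|}{3},\; 1 - \dfrac{|\dot{\theta}|}{0.3}\right) & \text{if } \operatorname{sgn}(\theta) = \operatorname{sgn}(\dot{\theta}),\; -3 \le \theta \le 3,\; -0.3 \le \dot{\theta} \le 0.3 \\[2ex]
\dfrac{|\theta + 10\dot{\theta}|}{3} & \text{if } \operatorname{sgn}(\theta) \neq \operatorname{sgn}(\dot{\theta}),\; -3 \le \theta + 10\dot{\theta} \le 3 \\[2ex]
1 & \text{otherwise.}
\end{cases}
\]

That means the fuzzy error is defined by one two-dimensional and two one-dimensional
membership functions. The learning rate $\sigma$ has been set to 0.01.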

[...]














Table 1: The changes in the ranges of the membership functions of $\theta$

Columns of the original: linguistic value, b, and the range at the start and after runs 1 to 5.
The header row with the initial values of each run is only partly legible. The surviving
entries are listed in reading order; "..." marks an entry that is illegible in the source.

value   b      ranges
NL      ...    ...       60.0000   60.0000
NM      -70    60.0000   60.0000   60.0000   60.0000   60.0000   171.9157
NS      -40    40.0000   ...       47.5219   49.0139   64.3483   74.5797
NZ      0      -13.0000  ...       -10.0961  -9.6819   -9.2074
PZ      0      13.0000   ...       17.6432   13.5961   13.6028   15.1134
PS      40     -40.0000  ...       -47.7372  -60.5255
PM      70     -60.0000  ...       -82.9168  -180.7220
PL      ...    -60.0000  -60.0000  ...



                       run 1   run 2   run 3   run 4      run 5
θ̄ without learning     4.05    4.17    4.29    n.a.t.b.   n.a.t.b.
θ̄ with learning        0.56    0.65    0.74    2.13       5.36
no. of trials          1       1       1       3          11

Table 2: The performance of the controller


[...] n loops:

\[
\bar{\theta} = \frac{1}{n} \sum_{i=1}^{n} |\theta_i|
\]


The controller was able to keep an angle near zero with activated learning procedure, and was
also able to balance the pendulum beginning from the extreme positions of runs 4 and 5 in 3
and 11 trials, respectively, whereas the controller was not able to balance (n.a.t.b.) the rod in
these cases without learning.


5 Discussion 


We have presented a learning algorithm for a neural fuzzy controller based on a fuzzy error
measure. The structure of the controller resembles a neural network and the fuzzy error
propagation can be compared to non-supervised learning procedures as they exist for certain
kinds of connectionistic systems. Simulations of the controller have shown that the learning
procedure improves the behaviour of the fuzzy controller and is able to handle situations where
the non-learning controller fails.

The introduced fuzzy error measure is suitable for describing the performance of the controller
and allows each rule to determine changes for the membership functions of its preconditions
and its conclusion. The learning algorithm starts from a predefined rule base that can
[...] structural knowledge [...] by the user unchanged,
but removes the errors caused by an inaccurate modelling [...]. The
results of the learning procedure can be easily interpreted [...] possible that two
different fuzzy sets describing the same linguistic value [...].

Other neural fuzzy control environments which are [...] neural network architectures
[1, 3] often use factors to weight the rules, or allow [...] representations
for the same [...]. [...] semantics
involved that are different to our approach. This has to be considered when an adaptive fuzzy
control environment is used.

An extension to the presented learning algorithm that is [...].

References 

[1] [...]

[2] [...]

[3] [...] Kalman [...]

[4] B. Kosko: Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs (1992)

[5] [...]

[6] [...] Controller, Part [...]. IEEE Trans. [...] Cybernetics, vol. [...], no. [...]

[7] [...] Fuzzy Algorithms [...]. Proc. IEE [...]

[8] [...] Control and Intelligent Systems, Dec. 2-[...], [...]

[9] [...]

[10] [...] Parallel Distributed Processing, Vol. 1. MIT Press, Cambridge [...]

[11] [...] Fuzzy [...] Reasoning, vol. [...]

[12] [...] Processes.

[13] [...]






N93-29561 


DETERMINING RULES FOR CLOSING CUSTOMER SERVICE CENTERS: 
A PUBLIC UTILITY COMPANY'S FUZZY DECISION 


André de Korvin

Department of Applied Mathematical Sciences 
University of Houston-Downtown



Margaret F. Shipley 
Department of Business Management 
University of Houston-Downtown 
One Main Street
Houston, Texas 77002 

Robert N. Lea 

Lyndon B. Johnson Space Center 
Houston, Texas 77058 



Abstract 

In the present work, we consider the general problem of 
knowledge acquisition under uncertainty. Simply stated, the problem 
becomes: how can we capture the knowledge of an expert when the 
expert is unable to clearly formulate how he or she arrives at a 
decision? 

A commonly used method is to learn by examples. We observe how 
the expert solves specific cases and from this infer some rules by 
which the decision may have been made. Unique to our work is the 
fuzzy set representation of the conditions or attributes upon which 
the expert may possibly base his fuzzy decision. From our examples, 
we infer certain and possible fuzzy rules for closing a customer 
service center and illustrate the importance of having the decision 
closely relate to the conditions under consideration. 

1. Introduction

Much effort has recently been devoted to studying the problem of 
knowledge acquisition under uncertainty. Uncertainty arises in many 
different situations. It may be caused by the ambiguity in the 
terms used to describe a specific situation. It may also be caused 
by skepticism of rules used to describe a course of action or by 
missing and/or erroneous data. [See (Arciszewski & Ziarko 1986),
(Bobrow et al. 1986), (Wiederhold et al. 1986), and (Zadeh
1983).]

To deal with uncertainty, techniques other than classical logic
and the application of statistical methods need to be developed.
[See Mamdani et al. (1985) for a study of the limitations of
traditional statistical methods.] Rough set theory can address the
limitations of statistics in dealing with uncertainty while
allowing rules to be extracted that describe a course of action or
a decision to be made. [See (Fibak et al. 1986), (Grzymala-Busse
1988), (Mrozek 1985 & 1987), (Pawlak 1981, 1982, 1983 & 1985), and
(Arciszewski & Ziarko 1986).] Fuzzy set theory is another tool used
to deal with uncertainty where ambiguous terms are present. [See
(Zadeh 1979, 1981 & 1983).] Our work builds on these alternatives
to statistics, allowing us to infer knowledge from the uncertainty
associated with ambiguous (i.e. fuzzy) terms.




2. Development of the Model

The main purpose of the present work is to study the general 
situation where the decision-maker is faced with uncertain (i.e. 
fuzzy) conditions and makes a fuzzy decision which might be 
strongly or weakly based on these conditions. In this situation, 
fuzzy rules can be extracted. We shall present the basic notations 
and concepts for developing a methodology to extract such rules 
from fuzzy conditions and fuzzy decisions. [Most of these concepts 
are discussed in (Grzymala-Busse 1988), and (Pawlak 1981, 1982 & 
1985) as they relate to crisp sets.] 

Basic Notations and Concepts 

Let U be the universe. Let R be an equivalence relation on U.
Let X be any subset of U. If [x] denotes the equivalence class of
x relative to R, then we define

\[
\underline{R}(X) = \{x \in U \mid [x] \subseteq X\} \quad \text{and} \quad \overline{R}(X) = \{x \in U \mid [x] \cap X \neq \emptyset\}.
\]

$\underline{R}(X)$ is called the lower approximation of X and $\overline{R}(X)$ is
called an upper approximation of X. Then $\underline{R}(X) \subseteq X \subseteq \overline{R}(X)$. If
$\underline{R}(X) = X = \overline{R}(X)$, then X is called definable.
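For a finite universe the two approximations can be computed directly from the equivalence classes. The following is a minimal sketch under that assumption; the function name `approximations` and the representation of the equivalence relation as a list of disjoint classes are choices made here for illustration.

```python
from typing import Hashable, Iterable, Set, Tuple


def approximations(classes: Iterable[Set[Hashable]],
                   X: Set[Hashable]) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (lower, upper) approximation of X for a partition of the universe."""
    lower: Set[Hashable] = set()
    upper: Set[Hashable] = set()
    for cls in classes:
        if cls <= X:   # the whole class lies inside X
            lower |= cls
        if cls & X:    # the class meets X
            upper |= cls
    return lower, upper
```

With the classes `{1, 2}`, `{3, 4}`, `{5}` and `X = {1, 2, 3}`, the lower approximation is `{1, 2}` and the upper approximation is `{1, 2, 3, 4}`, so X is not definable.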

An information system is a quadruple (U, Q, V, r) where U is the
universe and Q equals $C \cup D$ where $C \cap D = \emptyset$. The set C is called
the set of conditions (attributes); D is called the set of
decisions. The set V stands for value and r is a function from $U \times Q$
into V where r(u, q) denotes the value of condition q for element u.
The set C induces naturally an equivalence on U by partitioning U
into sets over which all attributes are constant. The set X is
called roughly C-definable
if $\underline{R}(X) \neq \emptyset$ and $\overline{R}(X) \neq U$. It will be called externally C-definable
if $\underline{R}(X) = \emptyset$ and $\overline{R}(X) \neq U$. It will be called internally C-definable
if $\underline{R}(X) \neq \emptyset$ and $\overline{R}(X) = U$.

Unfortunately, uncertainty is all too often present in the 
conditions and the decisions. The conditions and the decisions 
fail to partition the universe into well-defined classes and some 
overlap is present. We will deal with the issue of using rough set 
theory to handle the lack of clearly differentiated partitions by 
using fuzzy sets. We will thus need to "fuzzify" rough set theory.

Rough Set Notation Applied to Fuzzy Sets

Two functions on pairs of fuzzy sets will be used to
determine rules for closing a utility company's customer service
centers (CSCs). We define:

\[
I(A \subset B) = \inf_x \max(1 - A(x), B(x)) \tag{1}
\]

\[
J(A \# B) = \max_x \min(A(x), B(x)). \tag{2}
\]

Here A and B denote fuzzy subsets of the same universe. The
function $I(A \subset B)$ measures the degree to which A is included in B
and $J(A \# B)$ measures the degree to which A intersects B. It is
clear that I and J can be expressed as

\[
I(A \subset B) = \inf_x (A \rightarrow B)(x) \tag{3}
\]

\[
J(A \# B) = \max_x (A \cap B)(x). \tag{4}
\]

In addition, the following relation holds:

\[
I(A \subset B) = 1 - J(A \# \neg B), \tag{5}
\]

where $\neg B$ denotes the complement of B, i.e. $(\neg B)(x) = 1 - B(x)$.

[...] the fuzzy terms involved in the decision as a [...]. Let $\{B_i\}$ be [...]. By the lower
approximation of A through $\{B_i\}$, we mean the fuzzy set

\[
\underline{R}(A) = \bigcup_i I(B_i \subset A)\, B_i.
\]

The decision-making process may be simplified by disregarding all
$B_i$ for which $I(B_i \subset A)$ is less than some threshold $\alpha$. Then,

\[
\underline{R}(A)_\alpha = \bigcup_i I(B_i \subset A)\, B_i
\]

over all $B_i$ for which $I(B_i \subset A) \ge \alpha$. Similarly, we can define
the upper approximation of A through $\{B_i\}$ as

\[
\overline{R}(A)_\alpha = \bigcup_i J(B_i \# A)\, B_i
\]

over all $B_i$ for which $J(B_i \# A) \ge \alpha$. The functions I and J will yield two possible sets of rules: the
certain rules and the possible rules. The data given for the
customer service centers (CSCs) will be converted to fuzzy
sets [...]. [...] (possible rules) [...]. The fuzzy decision will have a lower and an upper
approximation, so that we have a measure of the minimum degree to
which the lower approximation implies the decision and the [...] degree to
which the decision satisfies the upper approximation.

It is important to realize that the present methodology does not
determine the quality of the decision. What is
determined is how closely the decision maker seems to depend on the
[...] set of attributes. If the decisions seem to
depend consistently on these values and if we trust the decision
maker, then we have the required knowledge, in terms of these
attributes, as to how decisions are made.
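The functions I and J and the thresholded approximations can be sketched for a finite universe as follows. The sketch rests on assumptions that the text does not fix: fuzzy sets are stored as Python dictionaries from elements to membership degrees, the names `inclusion`, `intersection`, `lower_terms` and `upper_terms` are hypothetical, and each approximation is returned as the list of retained $B_i$ (by index) together with its weight rather than as a single combined fuzzy set.

```python
from typing import Dict, Hashable, Iterable, List

FuzzySet = Dict[Hashable, float]  # element -> membership degree in [0, 1]


def inclusion(A: FuzzySet, B: FuzzySet, universe: Iterable[Hashable]) -> float:
    """I(A in B): minimum over x of max(1 - A(x), B(x)), equation (1)."""
    return min(max(1.0 - A.get(x, 0.0), B.get(x, 0.0)) for x in universe)


def intersection(A: FuzzySet, B: FuzzySet, universe: Iterable[Hashable]) -> float:
    """J(A # B): maximum over x of min(A(x), B(x)), equation (2)."""
    return max(min(A.get(x, 0.0), B.get(x, 0.0)) for x in universe)


def lower_terms(A: FuzzySet, family: List[FuzzySet],
                universe: List[Hashable], alpha: float) -> Dict[int, float]:
    """Weights I(B_i in A) of the B_i kept in the lower approximation."""
    weights = {i: inclusion(B, A, universe) for i, B in enumerate(family)}
    return {i: w for i, w in weights.items() if w >= alpha}


def upper_terms(A: FuzzySet, family: List[FuzzySet],
                universe: List[Hashable], alpha: float) -> Dict[int, float]:
    """Weights J(B_i # A) of the B_i kept in the upper approximation."""
    weights = {i: intersection(B, A, universe) for i, B in enumerate(family)}
    return {i: w for i, w in weights.items() if w >= alpha}
```

Read this way, an entry in `lower_terms` corresponds to a certain rule ("if $B_i$ then A" with the given degree) and an entry in `upper_terms` to a possible rule, which is the distinction the section draws between the two sets of rules.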

3. Application

[...] operations are regulated in Texas by the Public Utility Commission
(PUC). In November 1988, HL&P filed a request with the Public Utility
Commission for a $432 million rate increase. The public's
perception of HL&P's stability and sound judgment in the daily
management of its operations was critical to the outcome of the
rate case. HL&P needed to show that its decisions and operating 
procedures were initiated with total consideration given to 
effectively serving its customers. 

However, the company's management felt that in order to reduce
operating expenses in the event that the rate request before the
PUC was denied, one or more Customer Service Centers (CSCs) might
have to be closed. These customer service centers handled walk-in
customer traffic for payment of bills and general customer
inquiries and, thus, were operated for the public's convenience.
With the rate increase request before the PUC, HL&P had to
carefully analyze the CSC closing decision. The main consideration
for HL&P was the public's reaction. Although a decision to close a
site would potentially impact only a few customers, there might be
those who challenged the PUC rate hike request on the grounds of
paying more for less service.

HL&P investigated all relevant factors in making its decision.
The difference in relative operating expenses of CSCs was
negligible according to the company's operating and maintenance
budget. Therefore, operating cost could not be regarded as a major
consideration in the elimination of one of the CSCs. Four factors 
could be considered in this decision: the total number of 
customers in a district, the increase or decrease in a district's 
population, the number of customers utilizing the CSC in relation 
to the district's population, and the distance that customers 
would have to travel to an alternate CSC in the event their local 
CSC was closed. (See Table 1.) 


TABLE 1: Customer Service Center Data

              Avg. Customers   % Change in   Usage/       Rerouting
              in District      Customers     Population   Miles

Bayshore          38,510           5.1          4.64         15
Baytown           36,360          -1.4         21.5          15
Brazoria          20,689           3.4         14.07         20
Brazosport        21,976            .4          8.51         20
Cypress           44,074           8.3          1.87         17
Fort Bend         39,145           5.3         15.5          18
Galveston         31,263           -.1         36.44         20
Humble            55,911           1.0         12.44         15
Katy/Sealy        26,760           2.4         18.54         17
Wharton            8,707           -.74        39.43         18

NOTE: All of the above is based on 1985-1987 data.



Based upon the data given in Table 1, one of the authors served
as a decision maker in specifying a value indicative of a high
number of customers in the district and a low number in the
district; a great and a small percent change in usage; a high and
a low percentage of customers utilizing the center; and a large and
small rerouting distance. A high number of customers was 60,000 and
a low number of customers was 5,000. A great percent change was ±
9.00 and a small percent change was ± 0.1. A high usage/population
ratio was 40.00 percent and a low usage was 1.00 percent. A large
rerouting distance was 20 miles and a small distance was 10 miles.
The degree to which each site satisfied the definition of high,
low; great, small; high, low; large, and small was obtained by dividing
the Table 1 values by the parameter values defined above (for the low
and small sets, by dividing the parameter value by the Table 1 value),
with those values given in Table 2.

TABLE 2: Values for Fuzzy Sets of Conditions

             Avg. Customers    % Change in      Usage/           Rerouting
             in District       Customers        Population       Miles
             HIGH    LOW       great   small    high    low      large   small

Bayshore     .64     ...       .567    .020     .116    .216     .75     .667
Baytown      .606    ...       .156    .071     ...     ...      .75     .667
Brazoria     .345    ...       .378    .029     ...     ...      1.00    .500
Brazosport   .36     ...       .044    .250     ...     ...      1.00    .500
Cypress      .73     ...       .922    .012     ...     ...      .85     .588
Fort Bend    .65     ...       .589    .019     ...     ...      .90     .556
Galveston    .52     ...       .011    1.000    ...     ...      1.00    .500
Humble       .93     ...       .111    .100     ...     ...      .75     .667
Katy/Sealy   .44     ...       .267    .042     ...     ...      .85     .588
Wharton      .14     ...       .082    .135     ...     ...      .90     .556

NOTE: "..." marks entries that are illegible in the source copy; two-digit
entries in the HIGH column have an illegible third digit.
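
The membership values for the fuzzy sets of conditions follow from simple ratios of the Table 1 data and the decision maker's parameter values (60,000 and 5,000 customers; 9.00 and 0.1 percent change; 40.00 and 1.00 percent usage; 20 and 10 miles). The sketch below is one way to carry out that calculation; the function and key names are illustrative, and clipping each ratio at 1.0 is an assumption made here so that every result remains a valid membership degree.

```python
# Table 1 data: (customers, % change in customers, usage/population %, rerouting miles)
CSC_DATA = {
    "Bayshore":   (38510,  5.1,   4.64, 15),
    "Baytown":    (36360, -1.4,  21.5,  15),
    "Brazoria":   (20689,  3.4,  14.07, 20),
    "Brazosport": (21976,  0.4,   8.51, 20),
    "Cypress":    (44074,  8.3,   1.87, 17),
    "Fort Bend":  (39145,  5.3,  15.5,  18),
    "Galveston":  (31263, -0.1,  36.44, 20),
    "Humble":     (55911,  1.0,  12.44, 15),
    "Katy/Sealy": (26760,  2.4,  18.54, 17),
    "Wharton":    (8707,  -0.74, 39.43, 18),
}


def memberships(customers, change, usage, miles):
    """Membership degrees as ratios against the stated parameter values.

    Assumption: ratios are clipped at 1.0 (not stated in the text).
    """
    return {
        "customers_high": min(1.0, customers / 60000),
        "customers_low":  min(1.0, 5000 / customers),
        "change_great":   min(1.0, abs(change) / 9.0),
        "change_small":   min(1.0, 0.1 / abs(change)),
        "usage_high":     min(1.0, usage / 40.0),
        "usage_low":      min(1.0, 1.0 / usage),
        "miles_large":    min(1.0, miles / 20.0),
        "miles_small":    min(1.0, 10.0 / miles),
    }


MEMBERSHIPS = {name: memberships(*row) for name, row in CSC_DATA.items()}

for name, values in MEMBERSHIPS.items():
    print(name, {key: round(value, 3) for key, value in values.items()})
```

Rounded to three decimals, these ratios agree with the legible entries of the table, for example the percent-change and rerouting columns.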

Using the total operating revenue generated by each
center, our decision maker determined that if the revenue generated was less
than 1% of the total generated, the CSC would be
closed; conversely, the center would not be closed if revenue [...]. The respective
valuations of each CSC for closing and not closing are given in
Table 3.

TABLE 3: Revenue of each CSC & Closing Weight

                  Revenue        Closing    Not Closing
                                 Weight     Weight

Bayshore        270,411,636       .039       1.000
Baytown         142,262,298       .075       1.000
Brazoria         44,464,243       .239        .419
Brazosport      144,290,786       .074       1.000
Cypress          92,178,304       .115        .869
Fort Bend        88,498,221       .120        .834
Galveston        89,125,871       .119        .840
Humble          120,219,083       .088       1.000
Katy/Sealy       53,675,510       .198        .506
Wharton          15,660,308       .677        .148

Total         1,060,786,260
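
The closing weights can be checked against the revenue figures. The sketch below assumes that the closing weight of a center is the ratio of 1% of the total revenue to that center's revenue, clipped at 1.0; this reading is inferred from the 1% criterion stated in the text, and the function name is illustrative.

```python
# Revenue generated by each CSC (Table 3).
REVENUE = {
    "Bayshore":   270_411_636,
    "Baytown":    142_262_298,
    "Brazoria":    44_464_243,
    "Brazosport": 144_290_786,
    "Cypress":     92_178_304,
    "Fort Bend":   88_498_221,
    "Galveston":   89_125_871,
    "Humble":     120_219_083,
    "Katy/Sealy":  53_675_510,
    "Wharton":     15_660_308,
}

TOTAL = sum(REVENUE.values())


def closing_weight(revenue, total, share=0.01):
    """Assumed rule: (share of total revenue) / (center revenue), clipped at 1."""
    return min(1.0, share * total / revenue)


CLOSING = {name: round(closing_weight(r, TOTAL), 3) for name, r in REVENUE.items()}
print(TOTAL)
print(CLOSING)
```

Under this assumption the computed weights reproduce the closing column, with Wharton the strongest candidate for closing and Bayshore the weakest.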




Of course, no one at sYn^mcst 

businesses define profitability in terms of revenue generated and











tirto mnrcqentatives had obtained this information, we have 
assumed that the total operating revenue would be the major factor

affecting Uie of them even unknown to the 

in 5 ea1 ^' himself may impact the decision of closing a 
decision Still, we are interested in learning by 

SSSS hr^lh^e deiiXtn can be *« the attributes 

for which HL&P had accumulated data for each CSC. 

Example 1. In this example we selected two attributes:
Usage/Population and Rerouting Distance.

Let $x_1, x_2, \ldots, x_{10}$ denote the customer service centers, such that
$x_1$ = Bayshore, $x_2$ = Baytown, ..., $x_{10}$ = Wharton. Then $D_A$ = Close the
CSC, and $D_B$ = Do Not Close the CSC. The decision to close the
facility can be evaluated as:

$D_A = .039/x_1 + .075/x_2 + .239/x_3 + .074/x_4 + .115/x_5 + .120/x_6 + .119/x_7 + .088/x_8 + .198/x_9 + .677/x_{10}$

This shows that, based upon revenue generated, Wharton is a
fairly good example of a CSC to be closed, while Bayshore is not a
good example of $D_A$.

Likewise, we can indicate the degree of membership of each CSC
for each condition/attribute: High (H) Usage/Population, Low (L)
Usage/Population, Large (G) Rerouting
Distance, and Small (S) Rerouting Distance. For example, we define

$H = \ldots + .352/x_3 + .213/x_4 + .047/x_5 + .388/x_6 + \ldots$

Now we compute the minimum degree to which possible combinations of
conditions/attributes are related to decision $D_A$. Thus,


$I(H \subset D_A) = .119$      $I(H \cap G \subset D_A) = .119$
$I(L \subset D_A) = .465$      $I(H \cap S \subset D_A) = .462$
$I(G \subset D_A) = .074$      $I(L \cap G \subset D_A) = .465$
$I(S \subset D_A) = .333$      $I(L \cap S \subset D_A) = .465$

With a threshold of $\alpha$, the rules for closing a CSC are:

1. If the usage/population percent is low [...], then the
CSC should be closed. (Belief = .465)

2. If the usage/population percent is high (approximately 40% of
the customers in the district utilize the CSC) and the
rerouting distance is small (approximately 10 miles), then
the CSC should be closed. (Belief = .462)

3. If the usage/population percent is low and the rerouting
distance is high (20 miles), then the CSC should be closed.

4. If the usage/population is low and the rerouting distance is
small, then the CSC should be closed.

Since no new information is provided by Rules 3 and 4, the rules are:

1. If the usage/population percentage is low, then the CSC should be
closed.

2. If usage/population is high and the rerouting distance is
small, then the CSC should be closed. [The belief is .462.]

Rule 1 is certainly reasonable. Rule 2 sounds less reasonable. It
is generated by the decision maker deciding fairly strongly in
favor of Wharton to be closed, although its usage/population was
definitely high and its rerouting distance was over .5 small. From
such examples we learn that with high usage and relatively low
rerouting distance a CSC should be closed. Note that from the data, we
do not feel that strongly about these rules. The extracted rules
[...] closing from past experience.

We now compute the degree to which fuzzy sets intersect $D_A$


as

$J(H \# D_A) = .677$      $J(H \cap G \# D_A) = .677$
$J(L \# D_A) = .115$      $J(H \cap S \# D_A) = .556$
$J(G \# D_A) = .677$      $J(L \cap G \# D_A) = .115$
$J(S \# D_A) = $ [...]     $J(L \cap S \# D_A) = .115$

5. If usage/population percent is high, then closing is
possible .677.

6. If rerouting distance is great, then closing is possible .677.

7. If usage/population is high and rerouting distance is
great, then closing is possible .677.

The extracted rule would be Rule 7. The possibility of closing if
usage/population is high and rerouting distance is great can't be
[...] recommended to be closed with strength
[...]. This result shows Rule 3 and Rule 4 to be superfluous to [...].

$\overline{R}(D_A) = .677\, H \cup .677\, G \cup .677\, (H \cap G)$

Rule 1 appears to be the most logical rule to accept.
Although [...],
it eliminates Wharton as the primary candidate for closing. It
should be [...] scores based on high
customer utilization [...] and relatively large as well as
relatively small [...] (.90 and .556, respectively) are
influencing the second and third decision rules. This example is an
excellent [...] attributes to
properly reflect [...]. In this example, the
decision to close a center was to be based solely on revenue
generated. This means that HL&P would select a center which
generated [...] that to be closed and the one which
generated the highest revenue becomes that least likely to be
closed. This [...] best site to close.
[...] percentage at Wharton is high,
leading one to the conclusion that, in general, those centers with
high customer usage should be closed.
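
The degrees of inclusion and intersection used in Example 1 can be recomputed from the data. The sketch below assumes the usual min/max forms, $I(A \subset B) = \min_x \max(1 - A(x), B(x))$ and $J(A \# B) = \max_x \min(A(x), B(x))$; this is an assumption about the operators, not a quotation of the definitions, although it does reproduce the legible values of the example. All function names are illustrative.

```python
# Per-center data in the order Bayshore, Baytown, Brazoria, Brazosport, Cypress,
# Fort Bend, Galveston, Humble, Katy/Sealy, Wharton.
USAGE = [4.64, 21.5, 14.07, 8.51, 1.87, 15.5, 36.44, 12.44, 18.54, 39.43]
MILES = [15, 15, 20, 20, 17, 18, 20, 15, 17, 18]
D_A = [.039, .075, .239, .074, .115, .120, .119, .088, .198, .677]  # "close the CSC"

H = [min(1.0, u / 40.0) for u in USAGE]   # High usage/population
L = [min(1.0, 1.0 / u) for u in USAGE]    # Low usage/population
G = [min(1.0, m / 20.0) for m in MILES]   # Large rerouting distance
S = [min(1.0, 10.0 / m) for m in MILES]   # Small rerouting distance


def fuzzy_and(a, b):
    """Intersection of two fuzzy sets (pointwise minimum)."""
    return [min(x, y) for x, y in zip(a, b)]


def inclusion(a, b):
    """Assumed form of I(A c B): min over x of max(1 - A(x), B(x))."""
    return min(max(1.0 - x, y) for x, y in zip(a, b))


def intersection_degree(a, b):
    """Assumed form of J(A # B): max over x of min(A(x), B(x))."""
    return max(min(x, y) for x, y in zip(a, b))


for label, fuzzy_set in [("H", H), ("L", L), ("G", G), ("S", S),
                         ("H&G", fuzzy_and(H, G)), ("H&S", fuzzy_and(H, S)),
                         ("L&G", fuzzy_and(L, G)), ("L&S", fuzzy_and(L, S))]:
    print(label,
          round(inclusion(fuzzy_set, D_A), 3),
          round(intersection_degree(fuzzy_set, D_A), 3))
```

The value for the combination of high usage and small rerouting distance comes out as .4625, which the example reports as .462.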

Example 2 is given to show that a closer relationship
between the decision and the attributes selected will lead to
seemingly more logical rules being determined. For this
illustration, we used the size of the customer base with the
percent usage which suggests that although the percent usage may be 
high, there may be many fewer customers at the center generating 
much less revenue, thus making the center a candidate for closing. 

Using the values of the fuzzy sets High (NH) and Low (NL) for
the number of customers, and High (UH) and Low (UL) for the
usage/population percentages given in Table 2:

$I(NH \subset D_A) = .088$      $I(NH \cap UH \subset D_A) = .463$
$I(NL \subset D_A) = .677$      $I(NH \cap UL \subset D_A) = .465$
$I(UH \subset D_A) = .119$      $I(NL \cap UH \subset D_A) = .677$
$I(UL \subset D_A) = .465$      $I(NL \cap UL \subset D_A) = .87$

With $\alpha$ = .60, the following rules would be determined:

1. If the number of customers is low, the belief that the CSC
should be closed is .677.

2. If the number of customers is low and the usage/population
is low, the belief that the CSC should be closed is .87.

3. If the number of customers is low and the usage/population
is high, the belief that the CSC should be closed is .677.

Rule 3 is redundant and we would keep Rules 1 and 2.

Also using $\alpha$ = .60, we can determine the following rules from:

$J(NH \# D_A) = .239$      $J(NH \cap UH \# D_A) = .239$
$J(NL \# D_A) = .574$      $J(NH \cap UL \# D_A) = .115$
$J(UH \# D_A) = .677$      $J(NL \cap UH \# D_A) = .574$
$J(UL \# D_A) = .116$      $J(NL \cap UL \# D_A) = .113$

4. If the number of customers in the district is low, closing
is possible .574.

5. If the usage/population is high, closing is possible .677.

6. If the number of customers in the district is low and the
usage/population is high, closing is possible .574.

From these rules, we select Rule 5.

Computing the upper and lower approximations based on $\alpha$ = .60,
we have:

$\underline{R}(D_A) = .677\, NL \cup .87\, (NL \cap UL) \cup .677\, (NL \cap UH)$ and
$\overline{R}(D_A) = .677\, UH$.

Thus, the acceptable rules, where Rule 1 and Rule 2 come from
certainty and Rule 3 comes from possibility, are:

1. If the number of customers is low and usage/population is
low, the CSC should be closed. [Belief is .87.]

2. If the number of customers is low, the CSC should be
closed. [Belief is .677.]

3. If the usage/population is high, the CSC can be closed.
[Plausibility is .677.]
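
The selection of rules by threshold in Example 2 can be sketched as follows. The code assumes the same min/max forms for $I$ and $J$ as are standard for fuzzy inclusion and intersection, and it keeps every condition whose degree is at least $\alpha$ = .60; the names `extract_rules`, `CERTAIN`, and `POSSIBLE` are illustrative, not taken from the paper.

```python
# Per-center data in the order Bayshore ... Wharton (Table 1), and the decision set D_A.
CUSTOMERS = [38510, 36360, 20689, 21976, 44074, 39145, 31263, 55911, 26760, 8707]
USAGE = [4.64, 21.5, 14.07, 8.51, 1.87, 15.5, 36.44, 12.44, 18.54, 39.43]
D_A = [.039, .075, .239, .074, .115, .120, .119, .088, .198, .677]

SETS = {
    "NH": [min(1.0, n / 60000) for n in CUSTOMERS],
    "NL": [min(1.0, 5000 / n) for n in CUSTOMERS],
    "UH": [min(1.0, u / 40.0) for u in USAGE],
    "UL": [min(1.0, 1.0 / u) for u in USAGE],
}
# Pairwise conjunctions of a customer-count set with a usage set.
for n_set in ("NH", "NL"):
    for u_set in ("UH", "UL"):
        SETS[n_set + "&" + u_set] = [min(a, b) for a, b in zip(SETS[n_set], SETS[u_set])]


def inclusion(a, b):
    """Assumed form of I(A c B): min over x of max(1 - A(x), B(x))."""
    return min(max(1.0 - x, y) for x, y in zip(a, b))


def intersection_degree(a, b):
    """Assumed form of J(A # B): max over x of min(A(x), B(x))."""
    return max(min(x, y) for x, y in zip(a, b))


def extract_rules(measure, alpha):
    """Keep the conditions whose degree with respect to D_A reaches alpha."""
    degrees = {name: measure(values, D_A) for name, values in SETS.items()}
    return {name: round(d, 3) for name, d in degrees.items() if d >= alpha}


CERTAIN = extract_rules(inclusion, 0.60)             # lower approximation
POSSIBLE = extract_rules(intersection_degree, 0.60)  # upper approximation
print(CERTAIN)
print(POSSIBLE)
```

With these assumptions the certain rules are the three low-customer conditions and the only possible rule is high usage, in line with the lower and upper approximations of the example.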

If strictly ordering the CSCs to be closed based upon Rule 2,
Wharton would be the decision maker's first choice for closing
(followed by Brazoria and Brazosport). Although Rule 3 appears to
be illogical, if strictly ordering a center to be closed based upon
this rule, Wharton would be selected (followed by Galveston and
Baytown). If using the more logical Rule 1, Wharton would not be
considered first. Brazoria, ranking second in having the lowest
number of customers and fifth in having a low usage/population
ratio, would be one possible choice for a CSC to be closed.
Brazosport, with the third lowest number of customers and the third
lowest usage/population ratio, would also be a good choice for
closure. [...] the second choices if strictly
[...] the number of customers in the
district. Since the number of customers in the district would
directly relate to the revenue generating power of a CSC, this
[...] realistic result and supports the need to
have well chosen attributes, reflecting the decisions made.

4. Conclusions

4 -v.em nri cm set is a limiting case of the fuz^ setting, 

erpjss 'szd'tr&z? sssiSK 

SSSS£S?TS2J £ 

aSi^es **£■£**” ^S’StSST^S^-k X 

process al l°Y® “ f CO urse,the quality of the learning depends 

examples - °* ^^Un attributes to the decision. . 

The process allows rules to be determined through incorporation
of data for all available alternatives for which a
decision must be made. The decision maker can specify a value he
considers [...] and we can calculate the
degree [...] in the fuzzy set. These
[...] subjectively assigned after examination of the
attribute data [...] can be specified as we did for the
[...] for determining the
decision rules is [...] intensive, although it does become more
labor intensive beyond the two attribute with one decision case.
The authors hope to have a computer [...]
available in the near future to handle large-scale

problems.

Again, we stress that the proposed method does not give an
answer to "are the decisions made, good decisions?". It is assumed
that the expert is knowledgeable about the conditions under which
the decision will be made. Our methodology gives an answer to "how
closely do decisions of the expert follow the attributes under consideration?"
If the decisions seem to closely follow
the values of the attributes, then strong rules can be acquired
through examples and the expert's knowledge can be put into machine

representation.

At this time, HL&P has not made a decision to close either of
the customer service centers. Management has relied on reducing the
operating costs of the centers by moving to the company's
[...] the CSC employees who generally had only
[...] submitted to HL&P as soon as the prototype computer program is
completed.


406 


.W. ' 


Arciszevski, T. and Ziarko, W. 1986. "Adaptive expert system 
for preliminary engineering design," Proceedings $ . Internet !Pn3l 
workshop on Exper t systems and their Applications. Avignon, 
France, 1, 696—712 . 

Bobrow, D.G., Mittal, S. and Stefik, M.J. 1986. "Expert systems 
perils and promises" communica tions of the — ASM, 29, 880-894. 

Fibak, J., Slowinski, K. and Slowinski, R. 1986. "The 
application of rough set theory to the verification of indications 
for treatment of duodenal ulcers by HSV," Proceeding? 5. — 
international workshop on Ex p e rt Systems and , t heir Application s, 
Avignon, France, 1, 587-594. 

Grzymala-Busse , J.W. 1988. "Knowledge acquisition under 
uncertainty: a rough set approach," Journal of Intelligent end 
Robotic Systems . 1, 3-16. 

MamdaniT A. , Efstathiou, J. and Pang, D. 1985. "Inference under 
uncertain expert systems 85," Proceedings Fifth -Techni cal 
conference British C omputer Society. Specialist Srcup cn EXPSCfe 


Systems . 181-194. 

Mrozek, A. "Information systems and control algorithms", 1985. 

Bulletin Polish Academy of Science. Technical Science. 33 , 195-204 . 
Mrozek, A. 1987. "Rough sets and some aspects of expert systems 
realization," Proceedings 7 th International Worksh op on . Expert 
systems and their Applications. Avignon, France, 597-611. 

Pawlak, z . 1981 . "Rough sets . Basic Notions," Jngtltate-CgfflR.Ut.er 
science. Polish Academy of Science RePOXt NOt.41I, Warsaw. 

Pawlak, Z. 1981. "Classification of objects by means of 
attributes," institute Computer Science Polish Academy .Science 
Report No. 429 . Warsaw. 

Pawlak, Z. 1982. "Rough sets,” Interna t ional Journal o_f Jnformat ton 
Computer Science .il, 341-356. 

Pawlak, Z. 1983. "Rough classifications," International - J o urnal ,.. Q . f 
Man-Machine Studies , 20, 469-483. . . 

Pawlak, Z. 1985. "Rough sets and fuzzy sets," FUZZY Jet?. and 
Systems . 17, 99-102. 

Wiederhold, G.C., Walker, M. , Blum, R. , and Downs, S. 

1986 . "Acquisition of knowledge from data," Proceedings ACMSIGART 
international Symposium o n Methodologies for Intelligent SYStSBS/ 
Knoxville, Tennessee, 78-84. . ^ ^ _ 

Zadeh, L.A. 1983. "The rule of fuzzy logic in the management of 
uncertainty in expert systems," FUZZY Sets and SYgt3fflgH> 119-227. 
Zadeh, L.A. 1979. "Fuzzy sets and information granularity, "ftflvansgs 
in Fuzzv Set The ory and APPlicatianS. 3-18. 

Zadeh, L.A. 1981. "Possibility theory and soft data analysis," 
Mathematical Frontiers of the Social .and Policy sciences- Eds. L. 
Cobb and R.M. Thrall, 69-129. Westview Press, Boulder, Colorado. 


r: 


IS 


407 








FUZZY SIMULATION IN CONCURRENT ENGINEERING 


A. Kraslawski, L. Nyström




Department of Chemical Technology, Lappeenranta University of Technology 
Box 20, SF- 53851 Lappeenranta, Finland 


ABSTRACT 

Concurrent engineering is becoming an increasingly important practice in
manufacturing.

One of the problems in concurrent engineering is uncertainty in the values of input
variables as well as operating conditions.

The problem addressed in this paper is the simulation of processes
in which the raw materials and the operational parameters have fuzzy characteristics.
The fuzzy input information is processed by the vertex
method together with the commercial simulation packages POLYMATH (1990) and GEMS (1987).
Examples are presented to illustrate the usefulness of the method for the
simulation of chemical engineering processes.


INTRODUCTION 

There are two main reasons to model uncertain knowledge in chemical engineering.
The first is the scale of the phenomena. The micro scale is of growing
interest to chemical engineers. The key examples are biochemical processes
and new materials technologies. Analysis at the level of agglomerates, cells, or
molecules is of a different type than that at the meso or macro scale.

The second reason is the global change in how technologies are evaluated by the
surrounding world. Environmental, economic, and cultural analysis is now needed to
answer the question: Is a given technology good or not?

In both cases, the source of the uncertainty is the process complexity. The more
complex the process is, the less information can be presented in a numerical and
objective way. Such a situation is a consequence of the behaviour of complex
systems. It is not a consequence of a lack of good tools of analysis.

Chemical engineers have to realize that a different type of process requires
different tools for understanding and description.

The change in the design process is an additional reason for the use of fuzzy
calculations.

Design in chemical engineering is a long and complex process.

Its consequences are a long product development cycle, high manufacturing costs, and,
often, poor final quality. The main reason for this situation is the sequential nature of
the design process. This way of proceeding results from the fact that the design
objectives and constraints are formulated gradually in all stages of the product and
process development. In the next step the product is tested and, if the criteria are
not achieved, the design procedure is repeated.

Such a method is called serial engineering.

The present situation of the chemical process industry is characterized by growing
competition, a rising degree of complexity, and a demand for high quality products. To
survive in this new situation, companies have to reduce the time from market
demand to full scale production, reduce costs, and be more flexible.

These demands create a need for new managerial as well as engineering techniques.

One of the new engineering methods is concurrent engineering
(Rosenblatt and Watson 1991, Ishii 1990, and Hartley 1991).

Its essence is the integration of various manufacturing, marketing, and engineering
activities. This is realized by the team work of multi-disciplinary groups.
People from marketing, design, manufacturing, sales, and services work
together. They formulate the required properties of a product, transform them into
engineering data, study the resulting manufacturing problems, and establish the final
parameters of the product.

The tool for communication inside such a multidisciplinary team is the "house of
quality" (Hauser and Clausing 1988, Thackeray and van Treeck 1990).

The integration of various activities results in the simultaneous generation and
evaluation of different variants of product and process. The comparison of both
types of engineering is presented in Fig. 1.

The imprecise values of raw material properties, operating parameters, and
product demands are consequences of applying concurrent
engineering tools to chemical engineering problems.

The properties of raw materials are imprecise especially in batch and bio-processes.
This is due to the nonhomogeneity of the substrates, which is normal in noncontinuous
processes as well as in natural products.

The imprecision in the operational parameters reflects the
contradictory conditions imposed on the process. The contradictions are
consequences of having to fulfil different criteria: a given criterion can be
reached at a given set of parameters, so to obtain a reasonable solution a
compromise has to be reached. The result of such a compromise is the creation of
operational ranges for parameters instead of crisp values of variables.

The variability of demand is a common situation that results from market changes.

Uncertainty of the stochastic type can be treated by well known probabilistic
methods. However, there have been very few attempts to take directly into account the
non-stochastic lack of precision in simulation as well as in optimization (Edgar and
Himmelblau 1989). There are several approaches to studying the influence of
uncertainty on the output variables. The most popular methods are flexibility and
sensitivity analysis. They are complicated and time consuming. The direct
introduction of fuzzy variables into the existing packages is the simplest way to
analyse non-stochastic uncertainty. There are several approaches to studying
uncertainty in design (Wood et al. 1991). In the present paper the vertex method is
applied to introduce fuzzy values into the existing programs.


VERTEX METHOD 

The vertex method, proposed by Dong and Shah (1987), is based on the $\alpha$-cut
concept and interval analysis. It enables the calculation of the membership
function $\mu_y$ of the following expression:

$y = f(x_1, \ldots, x_n)$   (1)

where $x_1, \ldots, x_n$ are fuzzy variables.

Let us assume a triangular form of the membership functions $\mu_{x_i}$. At a given
$\alpha$-level, the $\alpha$-cut of $x_i$ is the interval $[a_i, b_i]$, as shown in Fig. 2.
As a result, for the given $x_1, \ldots, x_n$ and $\alpha$-cut one obtains the set of
intervals $[a_1, b_1], \ldots, [a_n, b_n]$. This set of intervals forms an n-dimensional
region with $2^n$ vertices.

An example for n = 2 is given in Fig. 3. To obtain the y value in Eq. 1 at the $\alpha$-level
one has to calculate:

$y_1 = f(c_1), \ldots, y_r = f(c_r)$   (2)

where $c_1 = (a_1, \ldots, a_n), \ldots, c_r = (b_1, \ldots, b_n)$.

The y value in Eq. 1 at the level $\alpha$ is expressed as the interval:

$Y = [\min_k f(c_k), \max_k f(c_k)]$   (3)

The values of Y calculated at the different $\alpha$-levels create the output fuzzy value,
as presented in Fig. 3.

If the membership functions of the fuzzy variables are triangular, then the number of
runs equals $2^n$, where n is the number of fuzzy variables. The fuzzy output is
determined at different $\alpha$-levels according to Eq. 3. In this paper, for the sake
of simplicity, the Y values are calculated for $\alpha = 1$ and $\alpha = 0$.
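
The vertex method can be sketched in a few lines. The code below is a minimal illustration, assuming triangular fuzzy numbers written as (lower, modal, upper) triples and a function with no extremum inside the region spanned by the $\alpha$-cuts; the names `alpha_cut` and `vertex_method` and the example inputs are illustrative only.

```python
from itertools import product


def alpha_cut(triangular, alpha):
    """Interval [a, b] of a triangular fuzzy number (lower, modal, upper) at level alpha."""
    lower, modal, upper = triangular
    return (lower + alpha * (modal - lower), upper - alpha * (upper - modal))


def vertex_method(f, fuzzy_inputs, alpha):
    """Evaluate f at all 2^n vertices of the alpha-cut box; return (min, max) as in Eq. 3."""
    cuts = [alpha_cut(t, alpha) for t in fuzzy_inputs]
    values = [f(*vertex) for vertex in product(*cuts)]
    return min(values), max(values)


# Illustrative inputs (not from the paper): y = x1 + x2.
x1 = (1.0, 2.0, 3.0)
x2 = (2.0, 3.0, 5.0)
for level in (0.0, 0.5, 1.0):
    print(level, vertex_method(lambda a, b: a + b, [x1, x2], level))
```

Repeating the call over several levels and stacking the resulting intervals gives the output fuzzy number; when the function has an interior extremum, the vertex values alone no longer bound the output and the extremum has to be evaluated as well.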


EXAMPLES 

The examples presented below illustrate fuzzy simulation in concurrent
engineering problems. The fuzzy forms of the operating conditions as well as the raw
material properties are obtained by applying the "house of quality" method.

The imprecision of the operating conditions and physico-chemical properties is
studied in the first example.

The influence of imprecise parameters of raw materials and operational conditions
is examined in the second example.


Example 1. 

The following reactions have been studied by Himmelblau (1970):

$A + B \xrightarrow{k_1} C + F$

$A + C \xrightarrow{k_2} D + F$

$A + D \xrightarrow{k_3} E + F$

The proposed model for the reacting system is as follows:

$\frac{dA}{dt} = -k_1 AB - k_2 AC - k_3 AD$

$\frac{dB}{dt} = -k_1 AB$

$\frac{dC}{dt} = k_1 AB - k_2 AC$

$\frac{dD}{dt} = k_2 AC - k_3 AD$

$\frac{dE}{dt} = k_3 AD$

where the kinetic constants are $k_i = a_i \exp(-b_i / T)$, $i = 1, 2, 3$;
$a_i$, $b_i$ are constants and $T$ is the process temperature.

The initial and final conditions, with concentration expressed in mole/liter and time
in minutes, have been reported as:

$A(0) = 0.0209$, $B(0) = 0.00697$, $C(0) = D(0) = 0$ and $t = 200$.

The nominal values of the kinetic constants have been reported as:

$k_1 = 14.7$, $k_2 = 1.53$, $k_3 = 0.294$.

The aim of the study is to determine the sensitivity of the concentrations A and E
to the imprecise values of T.

Temperature T is given in the form of a fuzzy set. This is a consequence of the
application of the "house of quality" method.


Solution

The following fuzzy kinetic constants result from the "house of quality" method:

$k_1$ = (13.200 14.700 15.500)

$k_2$ = ( 1.180 1.530 1.720)

$k_3$ = ( 0.253 0.294 0.315)

There are $2^3 = 8$ vertices $c_k$ according to the vertex method. A given vertex $c_k$ is a
vector composed of three kinetic constants. The calculations should be carried out at
different $\alpha$-levels of the fuzzy kinetic constants.

The values of all vertices at $\alpha = 0$ are presented in Table 1.

If $\alpha = 1$, then the calculations are performed at only one vertex, c = ($k_1$ = 14.7, $k_2$ =
1.53, $k_3$ = 0.294). The system of differential equations has to be solved for all the
combinations of the kinetic constants' values.

As a result, the profiles of concentrations A and E are obtained using POLYMATH
(1990).

Because there are no extremal points, the fuzzy values of A and E after time t = 200
min can be determined from Eq. 3. Otherwise, instead of applying
Eq. 3, another approach should be used (Wood et al. 1991).

At $\alpha = 0$, according to Table 1, the minimal and maximal values of A and E are:
min A = 0.00551, max A = 0.00631, min E = 0.00151, max E = 0.00189.

At $\alpha = 1$ there is only one point to calculate. The results of the simulation are given in
Table 1 for vertex 9.

The resulting fuzzy concentrations A and E obtained for the fuzzy kinetic coefficients are
as follows:

A = (0.00551 0.00573 0.00631)

E = (0.00151 0.00177 0.00189)
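
The calculation of Example 1 can be sketched without a simulation package. The code below is a rough stand-in for the POLYMATH runs, not a reproduction of them: it assumes the mass-action model for the three consecutive reactions, takes E(0) = 0 (not stated in the text), and integrates with a hand-written fixed-step fourth-order Runge-Kutta scheme. All function names and the step count are illustrative choices.

```python
from itertools import product

# Fuzzy kinetic constants as (lower, modal, upper) triples.
K_FUZZY = [(13.2, 14.7, 15.5), (1.18, 1.53, 1.72), (0.253, 0.294, 0.315)]
INITIAL = (0.0209, 0.00697, 0.0, 0.0, 0.0)  # A, B, C, D, E; E(0) = 0 is assumed


def rates(state, k):
    """Right-hand side of the assumed mass-action model."""
    a, b, c, d, _ = state
    r1, r2, r3 = k[0] * a * b, k[1] * a * c, k[2] * a * d
    return (-r1 - r2 - r3, -r1, r1 - r2, r2 - r3, r3)


def shifted(state, deriv, h):
    """State advanced by h along the given derivative."""
    return tuple(s + h * ds for s, ds in zip(state, deriv))


def simulate(k, state=INITIAL, t_end=200.0, n_steps=2000):
    """Classical RK4 integration from t = 0 to t_end (minutes)."""
    h = t_end / n_steps
    for _ in range(n_steps):
        d1 = rates(state, k)
        d2 = rates(shifted(state, d1, h / 2), k)
        d3 = rates(shifted(state, d2, h / 2), k)
        d4 = rates(shifted(state, d3, h), k)
        state = tuple(s + h / 6 * (p + 2 * q + 2 * r + w)
                      for s, p, q, r, w in zip(state, d1, d2, d3, d4))
    return state


def output_interval(index, alpha):
    """Vertex method: min and max of one concentration over the alpha-cut vertices."""
    cuts = [(lo + alpha * (m - lo), hi - alpha * (hi - m)) for lo, m, hi in K_FUZZY]
    values = [simulate(vertex)[index] for vertex in product(*cuts)]
    return min(values), max(values)


NOMINAL = simulate((14.7, 1.53, 0.294))
A_SUPPORT = output_interval(0, 0.0)  # interval of A at alpha = 0
E_SUPPORT = output_interval(4, 0.0)  # interval of E at alpha = 0
print(NOMINAL, A_SUPPORT, E_SUPPORT)
```

The interval ends at $\alpha = 0$ together with the nominal run at $\alpha = 1$ give the three numbers of each triangular output. A stiff or production problem would call for an adaptive integrator instead of the fixed step used here.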


Example 2 

The problem consists in the estimation of the product characteristics that are
influenced by the imprecise properties of the raw material and the operating conditions.
The process under consideration is mechanical pulp mill peroxide bleaching. The raw
material is unbleached pulp and the product is bleached pulp.

The raw material properties are light scattering, brightness, and initial pulp pH. The
operating conditions are limited in this example to the initial peroxide concentration. The
product properties are brightness, final pH, and metal ions content.

The process has been simulated with GEMS (1987). Usually, the raw material is not
uniform. As a result, its parameters cannot be determined in a precise way. As a
consequence, the operating parameters are uncertain, too.

The aim is to establish the product properties taking into account the imprecision
of the raw material properties and operating conditions.

Solution

The form of the fuzzy input variables of the raw material is presented in Table 2. The results
of some simulations are presented in Table 3. The resulting interval values are given in
Table 4. The fuzzy characteristic of the products is determined by the Cartesian product of
two fuzzy sets $y_1$ and $y_2$ (Dubois and Prade 1988). The resulting fuzzy set is as
follows:


0 ^ 0 0 ^ — — L — ♦ 

(7.3; 68.77) (7.3; 71.85) (8.53; 68.77) (8.53; 71.85) 

1 , i _ + 3 

(8.75; 75.84) (8.93; 74.32) (8.93; 75.64) 


CONCLUSIONS 

The presented method can be used in the concurrent engineering approach to process
design. Its main advantage over the existing approaches is
its ability to study the uncertainty in the raw material characteristics as well as in
the operating conditions. It can be used with commercial packages without any changes
to the existing programs. The construction of a "compact" package is the main
aim for the future. Such a package should be composed of a simulator, a vertex method
module, and an interactive house of quality program.


REFERENCES

Dong W. and H.C. Shah (1987). Vertex method for computing functions of fuzzy
variables. Fuzzy Sets and Systems 24, 65-78.

Dubois D. and H. Prade (1988). Théorie des possibilités. Masson, Paris.

Edgar T.F. and D.M. Himmelblau (1989). Optimization of Chemical Processes.
McGraw-Hill.

GEMSOP-GEMS Optimization Package (1987). Reference Manual. Applied Process
Control Inc., Moscow, Idaho, USA.

Hauser J.R. and D. Clausing (1988). The house of quality. Harvard Business Review.

Hartley J. (1991). Simultaneous Engineering. Industrial Newsletters, Beds.

Himmelblau D.M. (1970). Process Analysis by Statistical Methods. Wiley & Sons.

Ishii K. (1990). The role of computers in simultaneous engineering. In Computers in
Engineering 1990, vol. 1, pp. 217-224. ASME, New York.

POLYMATH (1990). User-Friendly Numerical Analysis Programs, Eds. M.B. Cutlip
and M. Shacham. CACHE Corp., Austin.

Rosenblatt A. and G.F. Watson (1991). Concurrent Engineering. IEEE Spectrum.

Thackeray R. and G. van Treeck (1990). Applying quality function deployment for
software product development. J. Engng. Design 1, 389-410.

Wood K., K. Otto and E. Antonsson (1991). Engineering Design Calculations under
Uncertainty. IFEC 91, Yokohama.




Table 1. The vertex coordinates of the kinetic constants and the simulation results.

vertex
number    k_1     k_2     k_3      A·10^2   C·10^3   D·10^2   E·10^2

1         13.2    1.18    0.253    0.631    0.854    0.460    0.151
2         13.2    1.18    0.315    0.606    0.881    0.431    0.178
3         13.2    1.72    0.253    ...      0.415    0.495    0.161
4         13.2    1.72    0.315    0.551    0.436    0.464    0.189
5         15.5    1.18    0.253    0.630    0.850    0.460    0.151
6         15.5    1.18    0.315    0.606    0.881    0.431    0.178
7         15.5    1.72    0.253    0.577    0.415    0.495    0.161
8         15.5    1.72    0.315    0.551    0.436    0.464    0.189
9         14.7    1.53    0.294    0.573    0.544    0.466    0.177

NOTE: "..." marks an entry that is illegible in the source copy.


Table 2. Characteristics of raw material 


x, light 
scattering 

71.9; 72.2; 
15.0; <13 

* 

63.2; 63.8; 

brightnes 

1.1; 0.0 

BUM 

10.8; 11.0; 


0.6; aO | 


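Table 3. Examples of the results of simulation

x_1 light    x_2                            y_1     y_2          y_3
scattering   brightness   ...      ...      pH      brightness   metal ions

71.90        63.20        10.80    3.0      8.75    74.32        6.87·10^[...]

NOTE: "..." and "[...]" mark headers and an exponent that are illegible in the source copy.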


Table 4. Fuzzy output parameters of pulp

y_1  pH                      0     8.50; 8.53
                             1     7.52; 7.69

y_2  brightness              0     68.77; 69.7
                             1     74.54; 75.8

y_3·10^[...]  metal ions     0     5.12; 6.48
                             1     5.49; [...]


Fig. 1. Comparison of serial and concurrent engineering (panels: serial engineering; concurrent engineering; criteria shown: quality, safety, cost, operability)


Fig. 3. Vertex method




INVERSE PROBLEMS: 

FUZZY REPRESENTATION OF UNCERTAINTY 
GENERATES A REGULARIZATION 




V. Kreinovich 1 , Ching-Chuang Chang 1 , L. Reznik 2 , G. N. Solopchenko 3 
1 Computer Science Department, University of Texas at El Paso, El Paso, TX 79968 USA 
2 Department of Electrical and Electronic Engineering, Footscray Campus 
Victoria University of Technology, MMC Melbourne, VIC 3000 Australia 
3 St. Petersburg Technical University, St. Petersburg 195251, Russia 


Abstract. In many applied problems (geophysics, medicine, astronomy, etc.) we cannot directly
measure the values x(t) of the desired physical quantity x at different moments of time, so we
measure some related quantity y(t), and then we try to reconstruct the desired values x(t). This
problem is often ill-posed in the sense that two essentially different functions x(t) are consistent with
the same measurement results. So, in order to get a reasonable reconstruction, we must have some
additional prior information about the desired function x(t). Methods that use this information to
choose x(t) from the set of all possible solutions are called regularization methods.

In some cases, we know the statistical characteristics both of x(t) and of the measurement
errors, so we can apply statistical filtering methods (well developed since the invention of the Wiener
filter). In some situations, we know the properties of the desired process, e.g., we know that
the derivative of x(t) is limited by some number A, etc. In this case, we can apply standard
regularization techniques (e.g., Tikhonov's regularization).

In many cases, however, we have only uncertain knowledge about the values of x(t), about
the rate with which the values of x(t) can change, and about the measurement errors. In these
cases, usually one of the existing regularization methods is applied. There exist several heuristics
for choosing such a method. The problem with these heuristics is that they often lead to choosing
different methods, and these methods lead to different functions x(t). Therefore, the results x(t)
of applying these heuristic methods are often unreliable.


We show that if we use fuzzy logic to describe this uncertainty, then we automatically arrive
at a unique regularization method, whose parameters are uniquely determined by the experts'
knowledge. Although we start with a fuzzy description, the resulting regularization turns
out to be quite crisp.

1. INTRODUCTION 


What is an inverse problem ([TA77], [I83], [G84], [I86], [I86a], [LRS86], [CB86])? In many
applied problems (geophysics, medicine, astronomy, etc.) we cannot directly measure the values x(t)
of the desired physical quantity x at different moments of time, so we measure some related quantity
y(t), and then try to reconstruct the desired values x(t). For example, in case the dependency
between x(t) and y(t) is linear, we arrive at the problem of reconstructing x(t) from the equation
$y(t) = \int k(t,s)\,x(s)\,ds + n(t)$, where k(t,s) is an approximately known function, and n(t) denotes
the (unknown) errors of measuring y(t). These problems are called inverse problems.

Another example of an inverse problem is image reconstruction from noisy raw data.

Why are inverse problems so difficult to solve? These problems are often ill-posed in the
sense that two essentially different functions x(t) are consistent with the same observations y(t).
For example, since all measurement devices are inertial and thus suppress high frequencies,
the functions x(t) and $x(t) + \sin(\omega t)$, where $\omega$ is sufficiently large, lead to almost the same values of
y(t). So, in order to get meaningful results, we must somehow choose, from all possible solutions
x(t) (i.e., from all the functions that are consistent with the measurement results), one that is
the most reasonable, the most regular (in some sense). The process of choosing such a function is
therefore called regularization [TA77], [I83], [G84], [I86], [I86a], [LRS86].
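
The high-frequency example can be checked numerically. The following Python sketch is only an illustration under assumed values: the Gaussian kernel standing in for an "inertial" device, its width, the grid size, and the frequency are all hypothetical choices, not taken from the text. It shows that x(t) and $x(t) + \sin(\omega t)$ differ by about 1, while the corresponding noise-free measurements differ only slightly.

```python
import numpy as np

# Hypothetical discretization: n grid points on [0, 1].
n = 400
t = np.linspace(0.0, 1.0, n)
h = t[1] - t[0]

# Assumed smoothing kernel k(t, s) of an "inertial" measuring device;
# K @ x approximates the integral of k(t, s) x(s) ds.
width = 0.05
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * width ** 2)) * h

x = np.sin(2 * np.pi * t)               # a smooth process
omega = 2 * np.pi * 60                  # a sufficiently large frequency
x_perturbed = x + np.sin(omega * t)     # an essentially different process

y = K @ x
y_perturbed = K @ x_perturbed

gap_x = np.max(np.abs(x_perturbed - x))   # difference between the processes
gap_y = np.max(np.abs(y_perturbed - y))   # difference between the measurements
print(gap_x, gap_y)
```

With these assumed values, `gap_y` is far smaller than `gap_x`, so any measurement noise of comparable size makes the two processes indistinguishable from y(t) alone.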

Inverse problems are extremely important for space exploration. If we are analyzing
familiar processes, then we usually know (more or less) what the function x(t) looks like. For
example, we can know that x(t) is a linear function $x(t) = C_1 + C_2 t$, or a sine function
$x(t) = C_1 \sin(C_2 t + C_3)$, etc. In mathematical terms, we know that $x(t) = f(t, C_1, \dots, C_k)$, where f is
a known expression, and the only problem is to determine the coefficients $C_i$. This is how, for
example, the orbits of planets, satellites, comets, etc., are computed: the general shape of an
orbit is known from Newton's theory, so we only have to estimate the parameters of a specific
orbit. In such cases, the existence of several other functions x(t) that are consistent with the same
observations is not a big problem, because we choose only the functions x(t) that are expressed by
the formula $f(t, C_1, \dots, C_k)$.

In space exploration one of the main objectives (and the main challenges) is to analyze new
phenomena, new effects, qualitatively new processes, and in these cases no prior expression f is
known.

How are these problems traditionally solved? If we know the statistical characteristics
of x(t) and the statistical characteristics of the measurement errors n(t), then we can formulate the
problem of choosing the most probable x(t) and end up with one of the methods of statistical
regularization, or filtering (the Wiener filter is one example of this approach).

If we do not have this statistical information, but we know, e.g., that the average rate of change
of x(t) is smaller than some constant A (i.e., $\sqrt{\int \dot{x}(t)^2\,dt} < A$), then we can apply regularization
methods proposed by A. N. Tikhonov and others [TA77], [G84], [LRS86].

In particular, one of the most widely used (and most efficient) regularization techniques consists
of choosing, among all the x(t) that are consistent with the given observations, a function x(t) for which
the so-called Tikhonov functional (or Tikhonov stabilizer)

$$J(x) = \alpha_0 \int (x(t))^2\,dt + \alpha_1 \int (\dot{x}(t))^2\,dt + \alpha_2 \int (x^{(2)}(t))^2\,dt + \dots + \alpha_k \int (x^{(k)}(t))^2\,dt$$

takes the smallest possible value, where $\alpha_i$ are non-negative real numbers, $\alpha_k > 0$, $k \ge 1$, and $x^{(i)}(t)$
denotes the i-th derivative of x(t).

For image reconstruction problems, when instead of a function x(t) of one variable t we have a
function I(x,y) of two coordinates (that expresses brightness at a point (x, y)), a similar functional
that involves partial derivatives can be used.

If no such information is available, it is usually recommended to use Tikhonov's (or alternative)
regularization techniques that correspond to some values of $\alpha_i$. Several semi-heuristic rules for
choosing these parameters $\alpha_i$ are known. The problem with these choices is that different rules
sometimes lead to drastically different results, and therefore these results are unreliable.

Usually experts possess some uncertain knowledge. The whole situation seems hopeless,
but it is not. Yes, in new fields we do not have precise knowledge of what is going on, but we may
be able to make some uncertain predictions. For example, if we want to know how the temperature
on a planet changes with time t, then the experts can tell that, most likely, x(t) is limited by some
value M, and that the rate $\dot{x}(t)$ with which the temperature changes is typically (or "most likely",
etc.) limited by some value A, etc. We can also have some expert knowledge about the error; the
resulting expert's knowledge about the value of y(t) at some point t is a statement of the type
"most likely, the difference between the measured value y(t) and the actual value is
not bigger than $\delta$" (where $\delta$ is a positive real number given by an expert).


The importance of this information is stressed in [B92]. 


We show that, although we start with a fuzzy description, the resulting regularization
turns out to be quite crisp.

In Section 2 we will discuss briefly how to choose an appropriate representation of the experts'
uncertainty, and in Section 3 we use the resulting representations to solve the inverse problems.

2. PRELIMINARY DISCUSSION: 

HOW TO DESCRIBE RELATED UNCERTAINTY 

What we have to describe. We want to use fuzzy logic to describe this kind of uncertainty. So
we must:

• choose fuzzy representations of the experts' statements of the type "most likely, $|X| \le M$"
or "most likely, $|X - a| \le \delta$", where X is unknown, and M, a, $\delta$ are known values;

• choose a way to combine the resulting fuzzy statements into a membership function for different
functions x(t);

• transform this fuzzy description of x(t) into a single function that will be taken as a
solution of the inverse problem, i.e., choose an appropriate defuzzification.

In the present Section we will describe how to make all three choices. Actually we will start 
with choosing an appropriate combination rule, then we will choose an appropriate membership 
function, and then it will turn out that defuzzification is trivial. 

Combination function. In general, our uncertain knowledge about the unknown function x(t)
consists of statements of the following types: "most likely, $|x(t)| \le M$",
"most likely, $|\dot{x}(t)| \le A$", "most likely, $|y(t) - \int k(t,s)\,x(s)\,ds| \le \delta$", etc.
For any given function x(t), we are not 100% sure whether such a statement is
true for this function or not. The general idea of fuzzy logic is to describe this uncertainty by a
membership function, i.e., by a mapping that assigns to every x(t) a number from the interval [0,1]
that describes to what extent we believe that this statement is true.

Suppose that we have already determined a membership function for each moment of time
t and each statement. We must now generate a membership function that describes the total
knowledge, i.e., that describes the fact that the first statement is true, and the second statement
is true, etc. The total knowledge is obtained by applying "and" to all the statements, and therefore,
the resulting membership function must be obtained by applying one of the operations
$\&: [0,1] \times [0,1] \to [0,1]$ that express "and" to all the corresponding membership functions $\mu_i(x(t))$:

$$\mu(x(t)) = \mu_1(x(t)) \,\&\, \mu_2(x(t)) \,\&\, \dots$$

Experimental results given in [HC76], [O77], and [Z78] show that among all possible "and"-operations,
$a, b \to \min(a, b)$ and $a, b \to ab$ are the best fit for human reasoning. The min operation
does not seem to be adequate for our purposes, because if we use min, then, e.g., the degree to
which a function x(t) satisfies the condition "most likely, $|x(t)| \le M$" is equal to the minimum of
the degrees of the corresponding statements. This minimum is attained when the value of |x(t)| is
the biggest possible. Therefore, the function $x_1(t)$ that is everywhere equal to 2M gets the same
degree of consistency with the above-given rule as the function $x_2(t)$ that is almost everywhere
equal to 0 and attains the value 2M only on a small interval. Intuitively, however, for the first
function $x_1(t)$ (for which the inequality is always false), our degree of belief that it satisfies this
condition is practically 0, while for the second function $x_2(t)$, for which this inequality is almost
everywhere true, our degree of belief must be close to 1. So using min in our problem is inconsistent
with our intuition, and therefore we must use the product for &.
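
The comparison of the two "and"-operations can be illustrated with a small Python sketch. It is a hypothetical setup rather than anything from the original text: a grid of 100 moments of time, M = 1, and a Gaussian membership function $\exp(-\beta (x/M)^2)$ with $\beta = 1$ (the form adopted later in this section). The names `membership`, `x1`, and `x2` are illustrative only.

```python
import numpy as np

def membership(values, bound, beta=1.0):
    """Assumed Gaussian degree to which 'most likely, |value| <= bound' holds."""
    return np.exp(-beta * (values / bound) ** 2)

M = 1.0
n = 100                      # assumed number of moments of time
x1 = np.full(n, 2 * M)       # equal to 2M everywhere
x2 = np.zeros(n)
x2[:2] = 2 * M               # equal to 2M only on a small interval

# "and" as min: both functions get the same degree.
min_x1 = membership(x1, M).min()
min_x2 = membership(x2, M).min()

# "and" as product: x1 gets a vastly smaller degree than x2.
prod_x1 = membership(x1, M).prod()
prod_x2 = membership(x2, M).prod()
print(min_x1, min_x2, prod_x1, prod_x2)
```

Under these assumptions min cannot tell the two functions apart, while the product assigns $x_1$ a degree that is many orders of magnitude smaller than that of $x_2$.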

Comment. Other arguments for choosing different & operations are given in [K83], [KR86], [K87], 
[KKM88], [K89], [K89a], [K90], [KK90], [KL90], [KQL91], [KQLFLKBR92]. 

What membership functions to choose? We want to describe the statements of the type
"most likely, $|X - a| \le \delta$", where X is an unknown (x(t), $\dot{x}(t)$, or y(t)), and a, $\delta$ are known values
(for example, $\delta = M$ and a = 0). So we must describe to what extent any given value x satisfies
this condition.

Evidently, x satisfies the inequality $|x - a| \le \delta$ if and only if the value $y = (x - a)/\delta$ satisfies the
inequality $|y| \le 1$. Therefore, it is natural to assume that the statement "most likely, $|x - a| \le \delta$"
has the same degree of belief as the statement "most likely, $|y| \le 1$", where $y = (x - a)/\delta$. So, if we
are able to describe a membership function $\mu(y)$ that corresponds to the statement "most likely,
$|y| \le 1$", then we will be able to describe our degree of belief $\mu_1(x)$ that x satisfies the condition
"most likely, $|X - a| \le \delta$" as $\mu((x - a)/\delta)$. So the main problem is to find an appropriate function
$\mu(x)$.

In the present paper we use Gaussian membership functions $\mu(x) = \exp(-\beta x^2)$ for some $\beta > 0$.
Therefore, the statement "most likely, $|X - a| \le \delta$" will be described by the membership function
$\mu_1(x) = \exp(-\beta (x - a)^2/\delta^2)$.

Gaussian membership functions are widely used in fuzzy systems and fuzzy control (see, e.g.,
[K75], [BCDMMM85], [YIS85], [KM87, Ch. 5], etc.), and there are several theoretical explanations
of why they are so successful: in [KR86] and in Section 8 of [KQLFLKBR92] we prove that Gaussian
functions are optimal (in some reasonable sense), and in [KQR92] we describe reasonable axioms
that uniquely determine Gaussian membership functions.

A remark about defuzzification. Suppose that we have determined the membership functions
$\mu_i(x(t))$ that correspond to different statements about the unknown process x(t). Then the resulting
membership function $\mu(x(t))$ is obtained by multiplying the functions $\mu_i(x(t))$ that correspond
to these statements.

All the values of $\mu_i$ are $\le 1$. So, if we multiply many such values, we end up with very small
numbers. E.g., if we have 10 experts who all assign the truth value 0.9 to some event, the resulting
estimate is $0.9^{10} \approx 0.35$. Thus, the fact that for some process x(t) the membership value $\mu(x(t))$ is
small does not necessarily mean that this particular dependency x(t) is hardly possible. What is
meaningful is not the absolute, but the relative value of $\mu(x(t))$: if $\mu(x(t)) \ll \mu(y(t))$, then it does
mean that, according to our knowledge, x(t) is much less probable than y(t).

To make these comparisons easier, L. Zadeh proposed to use normalization, i.e., to turn from
$\mu(x(t))$ to $\mu'(x(t)) = N \mu(x(t))$, where the normalizer N is chosen in such a way that the maximal
value of $\mu'(x(t))$ is equal to 1 (i.e., $N = 1/\max_{x(t)} \mu(x(t))$).
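
As a small numerical illustration of the last two paragraphs, the following Python sketch multiplies ten degrees of 0.9 and then normalizes a set of membership values. The three unnormalized values in `mu` are made-up numbers for three hypothetical candidate processes; they are not from the original text.

```python
import numpy as np

# Ten experts, each assigning the truth value 0.9: the product is about 0.35.
degrees = np.full(10, 0.9)
combined = degrees.prod()

# Hypothetical unnormalized membership values of three candidate processes.
mu = np.array([1e-6, 4e-6, 2e-6])

# Normalization: N = 1 / max(mu), so that the largest value becomes 1.
N = 1.0 / mu.max()
mu_normalized = N * mu
print(combined, mu_normalized)
```

Normalization leaves the ratios between candidates unchanged; it only rescales them so that the most plausible candidate has degree 1.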

Comment. Theoretical explanations of this choice of a normalization are given in [KQLFLKBR92] 
(in the framework of a general mathematical foundation scheme for fuzzy logic). 




3. FUZZY DESCRIPTION OF RELEVANT EXPERTS' KNOWLEDGE
AND RESULTING REGULARIZATION


Let's first list the possible experts' statements.

1) Usually experts can give the approximate range of the process x(t), i.e., they can give a number
M for which "most likely, for every t, the value of |x(t)| is limited by M".

2) Usually they can also give some approximate bounds for the rate with which the values of
x(t) can change, i.e., they can give a number A for which "most likely, for every t, the value
of $|\dot{x}(t)|$ is limited by A".

3) Sometimes, the experts' knowledge and/or intuition can also prompt the approximate bounds
for the second time derivative of the process (acceleration), and bounds for some higher derivatives.
For each of these derivatives, an expert gives a value $A_i$ and states that "most likely,
for every t, the value of $|x^{(i)}(t)|$ is limited by $A_i$" (here $x^{(i)}(t)$ denotes the i-th derivative).

4) Experts can also give some information about the possible measurement errors, i.e., about the
values $n(t) = y(t) - \int k(t,s)\,x(s)\,ds$, where y(t) are the measured values. In this case, an expert
gives a value $\delta$ and states that "most likely, for every t, the value of |n(t)| is limited by $\delta$".

In addition to that, we have some measurement results y(t), and these measurement results
determine a set X of all the functions that are consistent with them. For example, if we know the
maximal possible value $\varepsilon$ of a measurement error n(t), then X consists of all the functions x(t) that
satisfy the inequality $|y(t) - \int k(t,s)\,x(s)\,ds| \le \varepsilon$ for all t.

We want to represent the expert knowledge in terms of a membership function that is defined 
on this set X. 

We cannot directly translate these statements into membership functions, so we need
an additional approximation process. Each of these statements refers not to a single value of
some variable, but to infinitely many values, namely, to the values of x(t) for all possible moments
of time t. So, if we write down all the resulting elementary statements, we will end up with
infinitely many such statements. So, to get a membership function that corresponds to the resulting
knowledge, we must apply an "and"-operator to infinitely many membership functions that
correspond to infinitely many elementary statements. But we know only how to apply an "and"-operator
to finitely many functions.

In order to cover the infinite case, we will apply the usual mathematical method of dealing
with infinities: we will first consider the case when the experts' statements are applicable only to
finitely many points $t_1, \dots, t_n$, and then let n tend to infinity in such a way that in the limit these
points $t_i$ are everywhere dense. One of the natural possibilities to do that is to choose $t_i = t_0 + ih$,
where h > 0, and then take $t_0 \to -\infty$, $h \to 0$, and $n \to \infty$ in such a way that $t_n = t_0 + nh \to +\infty$.

The resulting membership function: derivation. Let us apply this procedure and compute
the resulting membership function. The readers who are interested only in the final result can skip
this subsection.

Let's first consider the case when the only experts' knowledge consists of the bounds M and A
on |x(t)| and $|\dot{x}(t)|$. Then for each t the corresponding membership functions are $\exp(-\beta |x(t)|^2/M^2)$
and $\exp(-\beta |\dot{x}(t)|^2/A^2)$. Therefore, if we take into consideration these statements for $t = t_1, \dots, t_n$,
where $t_i = t_0 + ih$, the resulting membership function will be equal to the product of these membership
functions, i.e., will be equal to the following expression:

$$\mu_h(x(t)) = \prod_{i=1}^{n} \exp(-\beta |x(t_i)|^2/M^2) \times \prod_{i=1}^{n} \exp(-\beta |\dot{x}(t_i)|^2/A^2).$$

Comment. We are restricted to the set X of all functions x(t) that are consistent with the
measurement results. Therefore, the above expression for $\mu(x(t))$ is valid only for such functions x(t). All
functions x(t) that are not consistent with the measurement results are impossible, i.e., if $x(t) \notin X$,
then $\mu(x(t)) = 0$.

Since $\exp(a) \times \exp(b) = \exp(a + b)$, we can simplify the expression for $\mu(x(t))$ as follows:

$$\mu_h(x(t)) = \exp\Big(-(\beta/M^2) \sum_{i=1}^{n} |x(t_i)|^2 - (\beta/A^2) \sum_{i=1}^{n} |\dot{x}(t_i)|^2\Big).$$

What happens when $n \to \infty$? If we multiply the sum $\sum_{i=1}^{n} |x(t_i)|^2$ by $h = t_{i+1} - t_i$, we get an
integral sum for the integral $\int |x(t)|^2\,dt$. These integral sums tend to this integral when $h \to 0$.
Hence, for small h, this sum is approximately equal to $h^{-1} \int |x(t)|^2\,dt$. Therefore, the membership
function is approximately equal to the following expression:

$$\mu_h(x(t)) \approx \exp(-(\beta/h) J(x(t))),$$

where

$$J(x(t)) = M^{-2} \int |x(t)|^2\,dt + A^{-2} \int |\dot{x}(t)|^2\,dt.$$

When $h \to 0$, $(\beta/h) J(x(t)) \to \infty$, and, therefore, $\mu_h(x(t)) \approx \exp(-(\beta/h) J(x(t))) \to 0$. Therefore,
if we apply a transition to a limit, we end up with a meaningless expression $\mu(x(t)) = 0$.

In order to get a reasonable limit membership function $\mu(x(t))$, we must apply the normalization
procedure before going to a limit. In other words, we must transform $\mu_h(x(t))$ into
$\mu'_h(x(t)) = N \mu_h(x(t))$, where $N = 1/\max_{x(t) \in X} \mu_h(x(t))$.

The value of $\mu_h(x(t))$ is the biggest when the value of J(x(t)) is the smallest possible.
So, if we denote by m the smallest possible value of the functional J(x(t)) on X, we conclude
that $\max_{x(t) \in X} \mu_h(x(t)) = \exp(-(\beta/h) m)$. Therefore,
$\mu'_h(x(t)) = \exp(-(\beta/h)(J(x(t)) - m))$.

Now we are ready to describe the membership function $\mu(x(t)) = \lim_{h \to 0} \mu'_h(x(t))$ that
corresponds to the limit $h \to 0$. If J(x(t)) = m, then $\mu'_h(x(t)) = 1$ for all h, and hence $\mu(x(t)) = 1$.
If $J(x(t)) \ne m$, then, since m is the minimum of J, we get J(x(t)) > m, therefore
$(\beta/h)(J(x(t)) - m) \to \infty$, and hence, $\mu'_h(x(t)) \to 0$ as $h \to 0$; so in this case, we have $\mu(x(t)) = 0$.

As a result, we get a crisp membership function that corresponds to Tikhonov's
regularization. So, although we started with fuzzy statements
and fuzzy membership functions, the resulting membership function is crisp: it is equal either to 1
or to 0 depending on whether the functional J(x(t)) attains its minimum at x(t) or not. Hence, in
this case, we do not need any defuzzification procedure: we just pick a function x(t) from X for
which J(x(t)) attains its minimal value.
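
The way the normalized membership function becomes crisp as $h \to 0$ can be seen in a short Python sketch. The three values of J below are made-up numbers for three hypothetical candidate functions, and $\beta = 1$ is an arbitrary choice; only the formula $\mu'_h = \exp(-(\beta/h)(J - m))$ comes from the derivation above.

```python
import numpy as np

def normalized_membership(J_values, h, beta=1.0):
    """mu'_h = exp(-(beta / h) * (J - m)), with m the smallest J among the candidates."""
    J_values = np.asarray(J_values, dtype=float)
    m = J_values.min()
    return np.exp(-(beta / h) * (J_values - m))

# Hypothetical values of the functional J for three candidate functions.
J_candidates = [2.0, 2.1, 3.0]

for h in (1.0, 0.1, 0.001):
    print(h, normalized_membership(J_candidates, h))
```

As h decreases, the degree of the minimizing candidate stays equal to 1, while the degrees of all other candidates shrink towards 0, even for a candidate whose J is only slightly above the minimum.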

What if the experts can also give some bounds on the second and higher derivatives
of the process x(t)? In case an expert gives estimates $A_i$ for the i-th derivative and/or a bound $\delta$
for the measurement error, the resulting membership function is the same, with the only difference
that additional terms are added to J(x(t)): $A_i^{-2} \int (x^{(i)}(t))^2\,dt$ in case of the i-th derivative, and

$$\delta^{-2} \int \Big(y(t) - \int k(t,s)\,x(s)\,ds\Big)^2 dt$$

in case of an error bound.


How to solve inverse problems: resulting procedure. As a result, we arrive at the following
method of solving inverse problems:

1) ask an expert to give approximate bounds M for |x(t)| and A for $|\dot{x}(t)|$; if possible, get also his
bounds $A_i$ for the i-th derivative $|x^{(i)}(t)|$, and $\delta$ for the measurement error $|y(t) - \int k(t,s)\,x(s)\,ds|$;

2) from all the functions that are consistent with the measurement results, choose a function x(t)
for which the functional J(x(t)) attains the smallest possible value. In case the expert gives
only the estimates M and A, $J(x(t)) = J_0(x(t)) + J_1(x(t))$, where $J_0(x(t)) = M^{-2} \int |x(t)|^2\,dt$
and $J_1(x(t)) = A^{-2} \int |\dot{x}(t)|^2\,dt$. In case he gives bounds for the i-th derivative and/or for errors,
we must take $J(x(t)) = \sum_i J_i(x(t)) + J_e(x(t))$, where for i > 1 $J_i(x(t)) = A_i^{-2} \int (x^{(i)}(t))^2\,dt$
and $J_e(x(t)) = \delta^{-2} \int (y(t) - \int k(t,s)\,x(s)\,ds)^2\,dt$.
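
The procedure above can be sketched in Python for the case where the expert supplies M, A, and $\delta$. This is only a minimal illustration under several assumptions that are not in the original text: the functions are sampled on a uniform grid, the derivative is replaced by forward differences, the hard consistency requirement $x(t) \in X$ is dropped in favour of the error term $J_e$, and the test problem (Gaussian kernel, sine wave, noise level) is hypothetical. The names `functional` and `minimize_functional` are illustrative.

```python
import numpy as np

def functional(x, K, y, h, M, A, delta):
    """Discretized J(x) = J_0 + J_1 + J_e on a uniform grid with step h."""
    dx = np.diff(x) / h                                   # forward differences for the derivative
    return h * (np.sum(x ** 2) / M ** 2
                + np.sum(dx ** 2) / A ** 2
                + np.sum((y - K @ x) ** 2) / delta ** 2)

def minimize_functional(K, y, h, M, A, delta):
    """Minimize the quadratic functional above by solving its normal equations."""
    n = K.shape[1]
    D = (np.eye(n, k=1) - np.eye(n))[:-1] / h             # the same forward differences, as a matrix
    lhs = np.eye(n) / M ** 2 + D.T @ D / A ** 2 + K.T @ K / delta ** 2
    rhs = K.T @ y / delta ** 2
    return np.linalg.solve(lhs, rhs)

# Hypothetical test problem: Gaussian blur of a sine wave plus small noise.
n = 200
t = np.linspace(0.0, 1.0, n)
h = t[1] - t[0]
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / (2 * 0.03 ** 2)) * h
x_true = np.sin(2 * np.pi * t)
rng = np.random.default_rng(0)
delta = 1e-3                                              # assumed expert bound on the error
y = K @ x_true + delta * rng.standard_normal(n)

x_hat = minimize_functional(K, y, h, M=1.0, A=2 * np.pi, delta=delta)
print(functional(x_hat, K, y, h, 1.0, 2 * np.pi, delta))
```

Because the discretized functional is a positive definite quadratic form in x, its minimizer is the unique solution of a linear system, so no regularization parameter has to be tuned: the weights are fixed by the bounds M, A, and $\delta$.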

We can use ready-made software. The resulting method turns out to be a particular case of
Tikhonov's regularization scheme. Therefore we do not need to design any new software: we
can use the techniques, algorithms, and programs that have already been developed for Tikhonov's
regularization.

If the only thing we have done is justification of a well-known method, then what's
the buzz? Our proposal to use Tikhonov's method has two advantages over the usual heuristic
suggestion to use it:

i) Tikhonov's method is semi-heuristic, while we derived our method from the fuzzy formalism;

ii) we do not need any heuristic rule for choosing $\alpha_i$, because we have explicit expressions for these
parameters in terms of the experts' bounds.

Therefore, we avoid the problem of Tikhonov's regularization that different heuristic rules lead
to different values of $\alpha_i$ and, therefore, to different solutions x(t).

4. CONCLUSIONS 

Suppose that we must reconstruct x(t) from the measurement results y(t), and the problem
is ill-posed in the sense that drastically different functions x(t) are consistent with the same
measurement results. Such problems are very frequent in geophysics, astronomy, image processing,
etc. Suppose also that the only additional information that we have about the process x(t) is the
experts' estimates M and A for which the experts say that "most likely, for every t, the value of
|x(t)| is limited by M," and "most likely, for every t, the value of $|\dot{x}(t)|$ is limited by A", where
$\dot{x}(t)$ denotes the rate with which x(t) changes (i.e., in mathematical terms, the time derivative of x(t)).

Then a fuzzy representation of this uncertainty leads to the following method of using the
experts' knowledge: from all the functions that are consistent with the measurement results, we
choose a function x(t) for which the functional J(x(t)) takes the minimal possible value, where
$J(x(t)) = M^{-2} \int |x(t)|^2\,dt + A^{-2} \int |\dot{x}(t)|^2\,dt$.

Similar functionals can be described for the cases, when bounds for higher derivatives and/or 
measurement errors are known. 

The resulting method turns out to coincide with a particular case of the general Tikhonov’s 
regularization approach. This approach has already been implemented in software, and it has been 
successfully tested on numerous real-life ill-posed problems. 


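The advantage of our approach is that it solves the main problem of Tikhonov's regularization:
the choice of the parameters $\alpha_i$, which are now expressed explicitly in terms of the experts' bounds.

ACKNOWLEDGEMENTS

This work was supported by NSF Grant No. CDA-9015006, NASA Research Grant No. 9-482,
and [illegible in source]. The authors are thankful to the participants of the 1992 International
Fuzzy Systems and Intelligent Control Conference (Louisville, Kentucky), especially
to L. Zadeh, S. Smith, and J. M. Barone, for valuable discussions.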

REFERENCES

[B92] J. M. Barone. Fuzzy least squares and fuzzy entropy. Proceedings of the 1992 International
Fuzzy Systems and Intelligent Control Conference, Louisville, KY, 1992, pp. 170- .

[BCDMMM85] G. Bartolini, G. Casalino, F. Davoli, M. Mastretta, R. Minciardi, and E. [...]
Amsterdam, 1985, pp. 73-86.

[CB86] I. Craig and J. Brown. Inverse problems in astronomy. Adam Hilger Ltd., Bristol,
1986.

[G84] V. B. Glasko. Inverse problems of mathematical physics. American Institute of Physics,
N.Y., 1984.

[HC76] H. M. Hersch and A. Caramazza. A fuzzy-set approach to modifiers and vagueness in
natural languages. J. Exp. Psychol.: General, 1976, Vol. 105, pp. 254-276.

[I83] Inverse problems. SIAM-AMS Proceedings, Vol. 14. American Mathematical Society,
Providence, RI, 1983.

[I86] Inverse problems. Birkhauser Verlag, Basel, 1986.

[I86a] Inverse problems. Lecture Notes in Mathematics, Vol. 1225, Springer-Verlag, Berlin-
Heidelberg, 1986.

[K75] A. Kaufmann. Introduction to the theory of fuzzy subsets. Vol. 1. Fundamental theoretical
elements. Academic Press, N.Y., 1975.

[KKM88] V. Kozlenko, V. Kreinovich, and M. G. Mirimanishvili. The optimal method of
describing the expert information. In: Applied problems of systems analysis. Proceedings of the Georgian
Polytechnic Institute, 1988, No. 8, pp. 64-67 (in Russian).

[K83] V. Kreinovich. Foundations of the Maslov's operator. In: National Conference on the
Applications of Mathematical Logic, Tallinn, 1983, pp. 80-81 (in Russian).

[K87] V. Kreinovich. Semantics of Maslov's iterative method. In: Problems of Cybernetics.
Vol. 131, Moscow, 1987, pp. 30-63 (in Russian).

[K89] V. Kreinovich. The optimal choice of formulas of fuzzy logic. Center for New Informational
Technology "Informatika", Leningrad, Technical Report, 1989 (in Russian).





[K89a] V. Kreinovich. Optimization in case of uncertain optimality criteria: group-theoretic
approach. Center for New Informational Technology "Informatika", Leningrad, Technical Report,
1989 (in Russian).

[K90] V. Kreinovich. Group-theoretic approach to intractable problems. Lecture Notes in
Computer Science, Springer, Berlin, Vol. 417, 1990, pp. 112-121.

[KK90] V. Kreinovich and S. Kumar. Optimal choice of &- and V-operations for expert
values. In: Proceedings of the 3rd University of New Brunswick Artificial Intelligence Workshop,
Fredericton, N.B., Canada, 1990, pp. 169-178.

[KL90] V. Kreinovich and A. M. Lokshin. On the foundations of fuzzy formalism: explaining
formulas for union, intersection, negation and modifiers. University of Texas at El Paso, Computer
Science Dept. Technical Report UTEP-CS-90-28, October 1990.

[KQL91] V. Kreinovich, C. Quintana, and R. Lea. What procedure to choose while designing
a fuzzy control? Towards mathematical foundations of fuzzy control. Working Notes of the 1st
International Workshop on Industrial Applications of Fuzzy Control and Intelligent Systems, College
Station, TX, 1991, pp. 123-130.

[KR86] V. Kreinovich and L. K. Reznik. Methods and models of formalizing prior information
(on the example of processing measurements results). In: Analysis and formalization of computer
experiments. Proceedings of Mendeleev Metrology Institute, Leningrad, 1986, pp. 37-41 (in Russian).

[KQLFLKBR92] V. Kreinovich, C. Quintana, R. Lea, O. Fuentes, A. Lokshin, S. Kumar, I.
Boricheva, and L. Reznik. What non-linearity to choose? Mathematical foundations of fuzzy
control. Proceedings of the 1992 International Fuzzy Systems and Intelligent Control Conference,
Louisville, KY, 1992, pp. 349-412.

[KQR92] V. Kreinovich, C. Quintana, and L. Reznik. Gaussian membership functions are most
adequate in representing uncertainty in measurements. University of Texas at El Paso, Computer
Science Department, Technical Report, March 1992.

[KM87] R. Kruse and K. D. Meyer. Statistics with vague data. D. Reidel, Dordrecht, 1987.

[LRS86] M. M. Lavrentiev, V. G. Romanov, and S. P. Shishatskii. Ill-posed problems of
mathematical physics and analysis. American Mathematical Society, Providence, RI, 1986.

[O77] G. C. Oden. Integration of fuzzy logical information. Journal of Experimental Psychology:
Human Perception Perform., 1977, Vol. 3, No. 4, pp. 565-575.

[TA77] A. N. Tikhonov, V. Y. Arsenin. Solutions of ill-posed problems. V. H. Winston & Sons,
Washington, DC, 1977.

[YIS85] O. Yagishita, O. Itoh, and M. Sugeno. Application of fuzzy reasoning to the water
purification process. In: M. Sugeno (editor). Industrial applications of fuzzy control. North Holland,
Amsterdam, 1985, pp. 19-40.

[Z65] L. Zadeh. Fuzzy sets. Information and Control, 1965, Vol. 8, pp. 338-353.

[Z78] H. J. Zimmermann. Results of empirical studies in fuzzy set theory. In: Applied General
System Research (G. J. Klir, ed.). Plenum, New York, 1978, pp. 303-312.




QUANTIFICATION OF HUMAN RESPONSES 

R.C. Steinlage*, T.E. Gantner*, P.Y.W. Lim** 


Human perception is a complex phenomenon which is difficult to
quantify with instruments. For this reason, panels of several or many people are often
used to elicit and aggregate subjective judgments. Print quality, taste, smell, sound
quality of a stereo system, softness, and grading Olympic divers and skaters are some
examples of situations where subjective measurements or judgments are paramount. We
usually express what is in our mind through language as a medium but languages are
limited in available choices of vocabularies, and as a result our verbalizations are only
approximate expressions of what we really have in mind. For lack of better methods to
quantify subjective judgments, it is customary to set up a numerical scale such as 1, 2,
3, 4, 5 or 1, 2, 3, ... , 9, 10 for characterizing human responses and subjective judgments
with no valid justification except that these scales are easy to understand and
convenient to use. But these numerical scales are arbitrary simplifications of the
complex human mind; the human mind is not restricted to such simple numerical
variations. In fact, human responses and subjective judgments are psychophysical
phenomena that are fuzzy entities and therefore difficult to handle by conventional
mathematics and probability theory. The fuzzy mathematical approach provides a more
realistic insight into understanding and quantifying human responses. This paper
presents a method for quantifying human responses and subjective judgments without
assuming a pattern of linear or numerical variation for human responses. In particular,
quantification and evaluation of linguistic judgments was investigated.


The method used to code responses obtained from panelists is 
especially important when one wishes to make decisions concerning properties or events 
which are not objectively quantifiable but which must be evaluated subjectively. The 
problem of coding such responses has been addressed from many directions. In this 
paper we propose a technique, based in fuzzy mathematics, for quantifying and 
evaluating subjective responses and then we test our technique in situations where the 
properties are also objectively measurable. By testing our technique in objective 
situations, we hope to lend credibility to its use in purely subjective situations. The 
technique we describe is a refinement of techniques originally proposed by Saaty [4-8]. 


Saaty [4-8] proposes using five adjectives as “response words” in subjective panel 
tests. These words indicate that two samples are indistinguishable with respect to a 
given property or that the difference between them is slight, moderate, significant, or 
extreme (or absolute). Of course, panelists are permitted to hedge their bets and cast
their ballots between two such judgments. 


Thus Saaty is proposing a 9 point scale for linguistic or subjective judgments as 
illustrated below. The illustration is stated in terms of physical weight although the 
particular property is irrelevant. 





1 - A and B are equally heavy 

2 - 

3 - A is slightly heavier than B 

4 - 

5 - A is moderately heavier than B 

6 - 

7 - A is significantly heavier than B 

8 - 

9 - A is extremely heavier than B 


The integers 2, 4, 6, 8 represent compromise judgments between two of the 
above odd numbered positions. As Saaty states, this is a good scale in that it provides 
enough shades of meaning without expecting a panelist to be scrupulous. 

After obtaining panel data, the next problem is the analysis of this data. Aside
from the usual statistical analysis, a technique that has been shown to be successful in
fuzzy or subjective situations is to find the dominant eigenvalue and associated
eigenvector for the reciprocal matrix of paired comparisons. This analysis is based on
the work of Perron and Frobenius [1]. If n objects $A_1, \dots, A_n$ are being compared,
these are listed horizontally and vertically to indicate the rows and columns of a matrix
M. If $A_i$ is judged to be significantly heavier than $A_j$, then a 7 is placed in row i,
column j and 1/7 is placed in row j, column i:

$$ m_{ij} = 7, \qquad m_{ji} = \tfrac{1}{7} . $$

If our objective is to determine the respective weights of n objects, then the
resulting eigenvector should indicate the relative weights. If we have perfect
information (no judgments are necessary and responses are not restricted to integers and
their reciprocals) we could simply fill in the matrix using the ratio of the respective
weights: $m_{ij} = w_i / w_j$. We then obtain a reciprocal matrix: $m_{ji} = w_j / w_i = 1 / m_{ij}$.

It can be shown that $\lambda = n$ is the only non-zero eigenvalue for M and that
$W = (w_1, \dots, w_n)$ is its associated eigenvector; the correct weight determination is
indeed obtained as the eigenvector. This eigenvector is unique up to a scalar multiple.
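The claim above can be checked numerically. The following is a minimal sketch, assuming a plain power iteration in place of whatever eigenvalue routine the authors used; the function name `dominant_eigenpair` and the six weights are illustrative choices, not part of the paper.

```python
# Illustrative sketch; function names and weights are assumptions for this example.

def dominant_eigenpair(matrix, iterations=200):
    """Power iteration for a positive matrix.

    Returns (eigenvalue, eigenvector), with the eigenvector scaled to sum to 1.
    """
    n = len(matrix)
    vector = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(product)  # valid estimate because sum(vector) == 1
        vector = [value / eigenvalue for value in product]
    return eigenvalue, vector


weights = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
perfect = [[wi / wj for wj in weights] for wi in weights]  # m_ij = w_i / w_j
lam, vec = dominant_eigenpair(perfect)

# Corrupt one judgment and its reciprocal: the matrix stays reciprocal
# but is no longer consistent, and the dominant eigenvalue rises above n.
noisy = [row[:] for row in perfect]
noisy[0][5] = 1.0
noisy[5][0] = 1.0
lam_noisy, _ = dominant_eigenpair(noisy)

print(f"perfect information: lambda = {lam:.4f}")
print("recovered weights:", [round(sum(weights) * v, 3) for v in vec])
print(f"one corrupted judgment: lambda = {lam_noisy:.4f}")
```

With perfect information the eigenvalue equals the number of objects and the normalized eigenvector reproduces the weight proportions; a single contradictory entry is enough to push the eigenvalue above n.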

If the experiment was needed, however, perfect information is not available at
the outset. But if the responses are a reasonable approximation to the reality of the
situation, then the responses will approximate those which would have been placed in
the "perfect information" matrix. Hence the eigenvalue should approximate n (the
number of samples) and the associated eigenvector should approximate the actual
distribution of the property (weight, etc.) among the samples. Thus the eigenvector not
only orders the samples; when it is normalized so that $v_1 + \dots + v_n = 1$, $v_i$ indicates
the percentage of the total property possessed by object i. The eigenvalue $\lambda$ is a
measure of the consistency of the judgments made by the panelist. A good rule of thumb
is that if $\lambda > n + $ [...], the panelist has contradicted himself or herself so many times
and/or so egregiously that his or her responses should be ignored. On the other hand,
if $\lambda$ is very close to n, the panelist was very consistent (although not necessarily
accurate or correct). In short, the eigenvalue is a good flag to indicate errors in
recording data; e.g., a number and its reciprocal may be interchanged.

The 1 to 9 scale does conform well to linguistic comparisons in the sense that it
allows one to discriminate simultaneously on 9 = 7 + 2 levels. This is the maximum
number in the range 7 ± 2 of simultaneous comparisons that an individual can keep in
mind without becoming confused; see Miller [3]. If a scale much larger than 9 is used,
the differences in reciprocals become negligible and some discrimination between
samples in the resulting eigenvector will be lost. A collection of objects in which the
samples may be too widely diverse should be subjected to a hierarchical analysis [4-8].

However, our experience indicates that while the above 1 - 9 scale may be
appropriate for eliciting and coding human responses, it is not always the proper scale
to be used in the ensuing matrix analysis. In fact, the scale used will be reflected in the
results. The largest number used is in essence the ratio between the strongest and
weakest (or heaviest and lightest, etc.) objects in the resulting eigenvector. Thus an
inappropriate numerical scale will lead to undesirable end effects concerning the
extremes of the objects being compared. This end effect is extremely volatile when
computing percent error on the low end. Our experience indicates that a linear
rescaling of the 1 - 9 linguistic scale to a scale determined by the accepted or perceived
ratio of the two extreme objects in the given group significantly reduces this end-point
effect.
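One way to write such a linear rescaling is sketched below. The paper does not state the formula explicitly, so the form is an assumption: a judgment $s$ on the 1 - 9 scale is mapped onto a 1 - $r$ scale, where $r$ is the accepted or perceived ratio of the two extreme objects, and reciprocal entries stay reciprocal.

```latex
s' = 1 + (s - 1)\,\frac{r - 1}{9 - 1}, \qquad \frac{1}{s} \;\mapsto\; \frac{1}{s'} .
% Example with r = 3: the judgment s = 7 ("significantly heavier") becomes
% s' = 1 + 6 \cdot \tfrac{2}{8} = 2.5, and the reciprocal entry 1/7 becomes 1/2.5 = 0.4.
```

Under this assumed map the endpoints are preserved ($s = 1$ stays 1, $s = 9$ becomes $r$) and the intermediate linguistic levels are spaced evenly between them.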


An Example: Consider 6 weights $w_1 = 2$, $w_2 = 4$, $w_3 = 6$, $w_4 = 8$, $w_5 = 10$,
and $w_6 = 12$. The matrix $M_6 = [w_i / w_j] = [m_{ij}]$ is the matrix of perfect information,
and the integer entries of this matrix range from 1 to 6. In this case, the weight ratios
$w_2 / w_1 = w_4 / w_2 = w_6 / w_3 = 2$ all indicate that the numerator weight is twice that of
the denominator, which is quite different from the linguistic use of the number 2 in the
above 1 - 9 scale. The linguistic 2 says two samples are almost indistinguishable. The
dominant eigenvalue of $M_6$ is 6 and its unit eigenvector is $V_6 = (w_1 / w, \dots, w_6 / w)$,
where $w = w_1 + \dots + w_6 = 42$. We linearly rescaled the integer entries in $M_6$ to a
1 - 9 scale to get a reciprocal matrix $M_9$, as well as to a 1 - 3 scale to get a reciprocal
matrix $M_3$. The unit eigenvectors $V_9$ and $V_3$, respectively, corresponding to the
dominant eigenvalues of $M_9$ and $M_3$ generate weight vectors $42V_9$ and $42V_3$. These are
displayed in the table below. In all cases the eigenvalue $\lambda$ was less than 6.1005. These
low eigenvalues merely indicate consistency, not agreement with experimental
measurements.


                      PERFECT INFORMATION
                        VARIOUS SCALES

          42V_3     % error     actual (42V_6)     42V_9     % error

  w_1     3.570      78.5%            2             1.344     -32.8%
  w_2     5.418      35.5%            4             3.192     -20.2%
  w_3     6.720      12.0%            6             5.376     -10.4%
  w_4     7.812      -2.4%            8             7.89       -1.3%
  w_5     8.778     -12.2%           10            10.626       6.3%
  w_6     9.702     -19.2%           12            13.566      13.1%




In $M_9$, the 1 - 9 scale exceeds the actual maximum weight ratio $w_6 / w_1 = 6$. As
a result the eigenvector scale overestimates the heavier weights and underestimates the
lighter weights. In $M_3$, the 1 - 3 scale falls short of the maximum weight ratio $w_6 / w_1 = 6$.
As a result the eigenvector scale underestimates the heavier weights and
overestimates the lighter weights. The spread between the two extremes is too large
with a scale of 1 - 9 and too small with a scale of 1 - 3. However, all three scales
provide the proper ordinal ranking of the weights.
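The end effect described above can be reproduced with a short sketch. Two assumptions are made here that the paper does not spell out: the same linear map is applied to every ratio of at least 1 (not only the integer ones), and the eigenvector is found by power iteration. The helper names are hypothetical, and the numbers printed need not match the table digit for digit.

```python
# Illustrative sketch under the assumptions stated in the lead-in.

def dominant_eigenpair(matrix, iterations=200):
    """Power iteration; returns (eigenvalue, eigenvector summing to 1)."""
    n = len(matrix)
    vector = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(product)
        vector = [value / eigenvalue for value in product]
    return eigenvalue, vector


def rescale(ratio, top, old_top):
    """Map a ratio in [1, old_top] linearly onto [1, top]."""
    return 1.0 + (ratio - 1.0) * (top - 1.0) / (old_top - 1.0)


def rescaled_matrix(weights, top):
    """Reciprocal matrix whose entries >= 1 are the rescaled true ratios."""
    old_top = max(weights) / min(weights)
    n = len(weights)
    matrix = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if weights[i] > weights[j]:
                entry = rescale(weights[i] / weights[j], top, old_top)
                matrix[i][j] = entry
                matrix[j][i] = 1.0 / entry
    return matrix


weights = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
total = sum(weights)
results = {}
for top in (3.0, 6.0, 9.0):
    lam, vec = dominant_eigenpair(rescaled_matrix(weights, top))
    results[top] = (lam, [total * v for v in vec])
    print(f"scale 1-{top:g}: lambda = {lam:.4f}, weights = "
          f"{[round(x, 3) for x in results[top][1]]}")
```

The 1 - 6 scale returns the true weights, the 1 - 9 scale spreads the extremes too far apart, and the 1 - 3 scale pulls them together, while the ordinal ranking is the same in all three runs.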

As a practical test of our theory we duplicated Saaty's weight test on five
dissimilar objects of various sizes, shapes, and weights:

  1. Ski Boots           8 lb. 3 oz.
  2. Radio               2 lb. 13 oz.
  3. Iron                3 lb. 4 oz.
  4. Jug of Wax          7 lb. 9 oz.
  5. Pile of Kindling    6 lb. 4 oz.


Pairwise comparisons of these objects were made using the linguistic 1 to 9 scale
and the corresponding reciprocal matrix was generated. The results as compiled below
are distorted significantly from the actual weight distribution. Nevertheless the
eigenvalue $\lambda = 5.30$ is rather low. Again, this low eigenvalue indicates consistency of
the responses - not necessarily accuracy of the predictions.

On the other hand if we observe that the maximum ratio is Boots/Radio =
8.1875/2.8125 = 2.9111 ≈ 3, we see that a maximum ratio of 3 (as opposed to 9)
might have been better. Rescaling the original observations linearly to 1 - 3 from 1 - 9
changes the results considerably. These results too are tabulated below; they are seen
to be much more acceptable.





                             WEIGHT TEST

               Scale 1-9                            Scale 1-3
          Computed                 Actual      Computed
          Weight      % error      Weight      Weight      % error

  w_1      8.7          6.4%       8.1875       7.58        -7.3%
  w_2      0.84       -70.1%       2.8125       2.75        -2.1%
  w_3      1.4        -56.9%       3.25         3.06        -5.8%
  w_4     11.79        56.0%       7.56         8.47        12.0%
  w_5      5.33       -14.7%       6.25         6.23        -0.3%




[...] would have been permitted. [...] response for that matter) is [...]

An alternative to [...] represents equality of the samples and [...] extreme
dominance of one sample over the other.

[Bar graph: A over B ... Equality ... B over A]

Using such a bar graph [...] ability [...] desired. We used bar graphs of this type [...]
when the total magnitude was also [...] of panelists to ascertain small differences [...]
applicability of this process to situations in [...]. In short, we wanted to test the [...]
process be "fine tuned" to indicate [...] where the property [...] purely objective
situations. The [...] are described below.

Experiment 1. The thickness [...] Caliper is usually [...] of the non-uniformity of [...]
the caliper of that sample. [...] representative of the run of paper from which [...]
subjective judgments, we attempted to determine [...] compared with the [...] of the
same samples under laboratory conditions.

In all, 29 panelists participated in [...] to prevent [...] chosen so that caliper was
essentially the only [...] glued down to uniform metal [...] from influencing the [...] of
the block on all [...] not affect the evaluation.

Interpreting the panel responses on the [...] are averages over 27 [...] consideration
because of high eigenvalues). Caliper is given in 1/1000's of an inch.




                    CALIPER TEST
                     Scale 1-9

  Paper       Average      Computed      Relative
  Sample      Caliper      Caliper       Error

    1.          3.26         1.80         -44.7%
    2.          6.35         4.99         -21.4%
    3.          5.63         4.03         -23.1%
    4.         10.24        13.04          27.4%
    5.          4.29         3.99           7.0%
    6.          8.03         9.94          23.8%


Note that almost all errors are large but that the largest error occurs at the low
end and that the next largest error occurs at the high end. The sizes of these errors
would seem to severely limit the applicability of the process to situations in which such
delicate differences occur and are to be detected. On the other hand, the maximum
ratio in average measured calipers is 10.24/3.26 = 3.14. Reinterpreting the original
panel data on a 1 - 3.14 scale improves the results significantly. The results are
tabulated below. An extra column is included to indicate the variations in the several
caliper measurements taken to obtain the "Average Caliper" for the given paper sample.


                           CALIPER TEST
                           Scale 1-3.14

  Paper       Average      Computed      Relative      Inherent Variation
  Sample      Caliper      Caliper       Error         in Sample

    1.          3.26         3.09         -5.2%          ± 4.3%
    2.          6.35         5.99         -5.7%          ± 8.7%
    3.          5.63         5.17         -8.2%          ± 6.2%
    4.         10.24        10.51          2.6%          ± 1.3%
    5.          4.29         4.48          4.4%          ± 4.7%
    6.          8.03         8.55          6.5%          ± 3.7%


By rescaling the interpretation from 1 - 9 to a 1 - 3.14 scale, the distortion at
the low and high ends has been removed. In samples 2 and 5, the error is no larger
than the variation inherent in the sample itself; there is thus effectively no error in our
computed caliper. In no case is the error more than double the variation in the sample.
We think this kind of accuracy obtained from subjective non-quantified judgments is
astounding. It should be pointed out that sample variation does seem to be related to
the resulting errors in computed caliper. The lowest sample variation (1.3%)
corresponds to the lowest experimental error (2.6%). Thus it would seem that half the
error is attributable to variation in the samples themselves. The error [...]
experimental and the subsequent computational process is no [...] which is
contributed by variations within the samples.

We remind the reader again that we are [...] to measure objectively quantifiable
properties [...] weight and caliper so as to lend credibility [...] matter for which many
sophisticated [instrumental] approaches have been developed, but human perception is
still [...] of the final evaluation. In [2] non-impact printer image quality [...] studied
using paired comparisons elicited from panels. [...] responses were both used for
transcribing the panel responses. [...] the technique described in this paper. The
purpose of the experiment was to test the applicability of the process outlined in this
paper [...] differences must be determined. The resulting eigenvalue indicated that the
panelists were able to give consistent responses in making the paired [...] was not a
problem. The eigenvector [...] gave results consistent [...] samples which were ranked
as [...] evaluation of print quality closer to the practical marketing situation.

[...] demand far greater relative accuracy [...] properties. Early indications obtained in
a paired comparison [...] refined. For instance, the difference between a [...] a tripling
of business volume for a small contender. More precise results would be requested.

Many [...] subjective responses. However [...] computational process which follows the
subjective [...]. Too large a scale results in the extremes being separated too far; too
small a scale brings the extremes too close together.

Taking greater care in making the comparisons cannot correct for this distortion
if an inappropriate scale is chosen. This distortion is inherent in the computational
process.



This observation has a theoretical basis as well. For each column of the matrix
(when normalized) is an approximation to the (normalized) dominant eigenvector.
Thus the maximum ratio in any column should approximate the maximum ratio in the
eigenvector. In the column corresponding to the lightest (thinnest, etc.) sample all
entries will be integers on a linguistic scale or at least greater than or equal to 1 on a
continuous (bar graph) scale. The maximum ratio in this column should be [...]
extremes in the resulting eigenvector. This [...] in the examples and experiments
described in the body of this paper.
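The relationship between normalized columns and the dominant eigenvector can be illustrated with a small sketch. The 3 x 3 judgment matrix below is a made-up example on the linguistic scale, not panel data from the paper, and power iteration is again an assumed stand-in for the authors' eigenvalue computation.

```python
# Illustrative sketch; the judgment matrix is a hypothetical example.

def dominant_eigenpair(matrix, iterations=200):
    """Power iteration; returns (eigenvalue, eigenvector summing to 1)."""
    n = len(matrix)
    vector = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(n)) for i in range(n)]
        eigenvalue = sum(product)
        vector = [value / eigenvalue for value in product]
    return eigenvalue, vector


def normalized_column(matrix, j):
    """Column j of the matrix, scaled so that its entries sum to 1."""
    column = [row[j] for row in matrix]
    total = sum(column)
    return [value / total for value in column]


# Object 1 is slightly heavier than object 2 and moderately heavier than
# object 3; object 2 is slightly heavier than object 3.
judgments = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

_, eigenvector = dominant_eigenpair(judgments)
columns = [normalized_column(judgments, j) for j in range(3)]

lightest = 2  # index of the lightest object; its column has all entries >= 1
column_ratio = max(columns[lightest]) / min(columns[lightest])
eigen_ratio = max(eigenvector) / min(eigenvector)
worst_gap = max(abs(c - v) for col in columns for c, v in zip(col, eigenvector))

print("eigenvector:", [round(v, 3) for v in eigenvector])
for j, col in enumerate(columns):
    print(f"column {j + 1}:", [round(c, 3) for c in col])
print(f"max ratio in lightest column = {column_ratio:.2f}, "
      f"ratio of extremes in eigenvector = {eigen_ratio:.2f}")
```

Each normalized column lands near the eigenvector, and the largest entry in the lightest object's column sets the order of magnitude of the spread between the extremes, which is why the choice of scale carries through to the computed result.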

Thus in any panel test involving paired comparisons there are two distinct [...]:

1. Using an acceptable scale for the [...] rule is that if the samples are too diverse, a
hierarchical analysis would be in order.

2. The choice of [...] scale might be [...] presumably this ratio cannot be obtained
objectively.

Nevertheless, since some kind of [...] marketing experts could indicate [...]. An
advantage of the bar graph approach is that it readily lends itself to arbitrary [...]
quickly be processed for several scales and [...] without [...] input from panels.

However, some outside judgment will [...] computed spread between the extremes [...]
says that in trying several scales, we [...]. There is no critical point [...] only critical
point occurs with the minimum scale [...] no distinctions whatsoever.


* Department of Mathematics, University of Dayton, Dayton, Ohio

** [...] Union Camp Corporation, Franklin, Virginia


BIBLIOGRAPHY

[1] R.E. Bellman, Introduction to Matrix Analysis, second edition,
McGraw-Hill, New York, 1970.

[2] P. Lim, R.C. Steinlage and T.E. Gantner, [...] Conference, Vancouver BC, Canada,
November 1990, pp. 79 - 83.

[3] G.A. Miller, The magical number 7 ± 2: some limits on our capacity for
processing information, Psychological Review, vol. 63, 1956, pp. 81 - 97.

[4] T.L. Saaty, A scaling method for priorities in hierarchical structures,
Journal of Mathematical Psychology, vol. 15, 1977, pp. 234 - 281.

[5] T.L. Saaty, Exploring the interface between hierarchies, multiple
objectives, and fuzzy sets, Fuzzy Sets and Systems, vol. 1, no. 1, 1978, pp. 57 - 68.

[6], [7] [...] Resource [...] CA, 1982.

[8] T.L. Saaty and L.G. Vargas, The Logic of Priorities: Applications in
Business, Energy, Health, and Transportation, Kluwer-Nijhoff, Boston, 1982.





NON-SCALAR UNCERTAINTY

N93-29565

Uncertainty in Dynamic Systems 

Salvador Gutiérrez Martinez
Instituto Tecnológico de Morelia
Ave. Tecnológico 1500, Lomas de Santiaguito
58120 Morelia, Michoacán.
Mexico.

FAX (451) 216-43


Abstract 

The following point is stated throughout the paper: Dynamic systems are usually subject
to uncertainty. It may be the unavoidable quantic uncertainty that appears when working at
sufficiently small scales; at large scales it can be allowed by the researcher in order to
simplify the problem, or it can be introduced by non-linear interactions. Even though non-
quantic uncertainty can generally be dealt with by using the ordinary probability formalisms, it
can also be studied with the proposed non-scalar formalism. Thus, non-scalar uncertainty is a
more general theoretical framework giving more insight about the nature of uncertainty and
providing a practical tool in those cases in which scalar uncertainty is not enough, such as when
studying highly non-linear dynamic systems. This paper's specific contribution is the general
concept of non-scalar uncertainty and a first proposal for a methodology. Applications should
be based upon this methodology. The advantage of this approach is to provide simpler
mathematical models for prediction of the system states.

Present conventional tools for dealing with uncertainty prove insufficient for an effective
description of some dynamic systems. The main limitations are overcome by abandoning ordinary
scalar algebra in the real interval [0,1] in favor of a tensor field with a much richer structure
and generality. This approach gives insight into the interpretation of Quantum Mechanics and
will have its most profound consequences in the fields of elementary particle physics and
nonlinear dynamic systems. Concepts like "interfering alternatives" and "discrete states" have
an elegant explanation in this framework in terms of properties of dynamic systems such as
strange attractors and chaos.

The tensor formalism proves especially useful for representing dynamic systems with
models that are closer to reality and have relatively much simpler
solutions. It was found to be wiser to get an approximate solution to an accurate model than to
get a precise solution to a model constrained by simplifying assumptions. Precision has a very
heavy cost in present physical models, but this formalism allows a trade between uncertainty
and simplicity.

It was found that modeling reality sometimes requires that state transition probabilities 
should be manipulated as nonscalar quantities, finding at the end that there is always a 
transformation to get back to scalar probability. 


Non-scalar Uncertainty 


Introduction 

About 60 years ago, after strong experimental evidence in the field of elementary particle
physics, it was realized that probability theory as defined by that time was insufficient to handle
the unavoidable uncertainty in the behavior of microscopic physical systems. As stated by the
late R.P. Feynman, "...the laws of probability which are conventionally applied are quite
satisfactory in analyzing the behavior of a roulette wheel but not the behavior of a single electron
or a photon of light." ("Quantum Mechanics and Path Integrals", R.P. Feynman & A.R.
Hibbs, McGraw-Hill, 1965). As a result various formulations of theories generally known as
Quantum Electrodynamics and Quantum Mechanics were born. These theories have proven to
be enormously successful as predictive tools and are in this respect unchallenged to this day,
though they have originated much controversy by their philosophical implications. Nevertheless,
all of them overcome the limitations of probability to deal with the results of experiments. They
do so because they invariably resort to algebraic structures much richer than the real interval
[0,1]; all of them involve first working with complex or hypercomplex fields and
multidimensional structures and then prescribe a transformation that restates the predictions in
conventional probabilistic terms. It is not in the scope of this paper to state a formulation or
description of any of these theories, for there are countless of them available in the subject's
literature. They are mentioned as a monumental example of the potential of multidimensional
structures and complex fields in the treatment of uncertainty.

The higher dimensionality and more complex operations
involved in complex and hypercomplex fields are useful to
generate the predicted patterns in the probability distributions of
interfering alternatives.

What interfering alternatives are can be illustrated by
Young's experiment (Figure 1), which is in this description a
thought experiment that can be instrumented in more realistic
settings. A source of particles (electrons or photons, whatever)
emits them toward a screen, but between the source and the
screen we place a barrier with two slits. If we make the beam so weak that it consists of a single
photon at a time, we could assume that a single particle would go through either slit and then
it would be recorded at the screen. After a great number of particles have made their way one
by one through the slits to the screen they would form a visible pattern on the surface which would
represent the relative frequency (probability) distribution of a particle coming from the source,
through the slits, reaching a certain point on the screen. This probability density function is
represented by the curve at the right of the screen in Figure 1 and is unexpected, since it has
many local maxima and minima as if it were recording the effects of waves instead of particles.

If we could know with certainty which slit the electrons come through, say by blocking
one of them, then we would record a probability density function more like the curve indicated
in Figure 2. A similar result would be found when the other slit is blocked (Figure 3). If both
distributions were independent from each other we would find that the probability density
function would be the sum of the previous two, giving a bell shaped curve.



Figure 1 Young’s Experiment 


Figure 2 Distribution of Particles through one Slit.

Nevertheless, the observed frequency distribution is very
different from the expected one (Figure 4). This is what is meant
by saying that the alternatives interfere, much in the way waves
do, but the interference pattern is defined on the probability
(frequency) distribution. Whenever two or more alternatives
cannot be resolved by experiment, they always interfere. The
difference between the observed and expected patterns is caused
by the fundamental uncertainty described by Heisenberg's
principle. Whenever we want to interact with the particles to find
out which way they came, the interference pattern at the screen
is blurred. If we would like to
determine the particle's path by getting it to interact with a
photon or some other particle, then the disturbance produced by
the sensing particles would be unavoidably too big to find out
with precision where that particle was going, thus destroying the
interference pattern.
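The difference between adding probabilities and adding complex amplitudes can be seen in a toy calculation. This is a sketch only: the geometry (slit positions, screen distance, wavelength) and the spherical-wave amplitude are assumptions chosen for illustration and do not come from the paper.

```python
# Toy two-slit model; all numerical parameters are illustrative assumptions.
import cmath
import math


def slit_amplitude(x, slit_position, wavelength=1.0, distance=20.0):
    """Complex amplitude at screen position x for a path through one slit."""
    path = math.hypot(distance, x - slit_position)
    return cmath.exp(2j * math.pi * path / wavelength) / path


def count_local_minima(values):
    """Number of interior samples that are lower than both neighbours."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i] < values[i - 1] and values[i] < values[i + 1]
    )


positions = [i * 0.25 for i in range(-40, 41)]  # screen coordinates

# Independent alternatives: probabilities (squared moduli) are added.
independent = [
    abs(slit_amplitude(x, -2.0)) ** 2 + abs(slit_amplitude(x, 2.0)) ** 2
    for x in positions
]

# Interfering alternatives: amplitudes are added first, then the squared
# modulus converts the result back into a scalar.
interfering = [
    abs(slit_amplitude(x, -2.0) + slit_amplitude(x, 2.0)) ** 2
    for x in positions
]

print("local minima, independent sum:", count_local_minima(independent))
print("local minima, interfering sum:", count_local_minima(interfering))
```

The scalar sum gives a single smooth hump, while the sum of amplitudes produces alternating maxima and minima. The intermediate quantity is not a scalar probability, and the transformation back to one happens only at the end.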


Figure 3 Distribution when the other Slit is Closed.


Dynamic Systems Subject to Uncertainty

The essence of dynamic
systems is time dependency. When observing microscopic
dynamic systems we can say that much of Heisenberg's
unavoidable uncertainty can be focused on the time variable,
since most of the relevant variables are time dependent. In this
kind of dynamic system it is not possible to say with arbitrary
precision that a given particle is in a certain well defined state at
any precise moment, nor is it possible to say that it has a defined
trajectory, and there exist sets of time dependent variables whose
precise values cannot be known simultaneously, such as position
and momentum. A similar argument holds for any kind of
dynamic system subject to uncertainty, especially for non-linear
ones. This means that we are left with a system which can
assume a set of states which can be either well defined or fuzzy, but we do not know in what
state the system will be at a certain time.

Because of this loss of information on time dependency we are forced to study dynamic
systems disregarding the time variable; i.e., we are compelled to make time independent
statements about the states of a system. This is just the kind of statement that a frequency
(probability) distribution is: What states can a dynamic system assume and how likely is it to
be in any of them given some determined boundary conditions. Because of uncertainty,
observing the system by means of an experimental setting means that we will not generally
observe the same outcome for identical repetitions of the experiment. So, all we can do is repeat
experiments and measurements a large number of times and then watch the relative frequency
of the outcomes. Theoretical statements can only be made in probabilistic terms and
confrontation with reality can only be made in terms of comparing predicted probabilities against
observed relative frequencies.

Figure 4 Expected Frequency Dist.

The theory of dynamic systems shows that there are some states that can be called
"equilibria", which means that once a system has reached one of them it tends to stay in it for
a long time. If a system tends to abandon an equilibrium state at the slightest perturbation then
this point is called "unstable". If we observe just the opposite, i.e. the system tends
to stay in some state regardless of the effect of small perturbations, then it is called a "stable
equilibrium". Of course, things in reality are not always that simple, for we can find some special
states around which the behavior of the system tends to wander. They are called strange
attractors and can have a simple or very complex nature. The reader is referred to the vast
literature on the subject to extend and clarify these concepts.


If a dynamic system subject to uncertainty has strange attractors, they will show up in
the frequency (probability) chart as a peak, band or concentration of points, since it will spend
a considerable part of the time on them. These peaks look very much like interference patterns
when the dynamic system is defined by non-linear functions.

This point can be nicely illustrated with a very well known example, the Verhulst Process
(a population growth model, [Peitgen]). We make the following initial assumptions:

$x_0$ = Initial Population Size
$x_n$ = Population Size after n years
$R = \dfrac{x_{n+1} - x_n}{x_n}$ = Relative Increase per year

If this rate is constant (say $r$), then the law is:

$$ x_{n+1} = (1 + r)\,x_n $$

If R varies with population size, then $R = r(1 - x_n)$, where $r > 0$ is the "growth parameter".
Then,

$$ x_{n+1} = f(x_n) = (1 + r)\,x_n - r\,x_n^2 $$

Then $x_0 = 0$ and $x_0 = 1$ are equilibrium points. Analysis for $0 < x_0 \ll 1$, $r > 0$ yields:












Figure 12 Interference Patterns in the
Frequency Distribution


As the Frequency Maps show (Figures 6, 8, 10 & 12), there are "interference" patterns
apparent in the frequency distribution of states. As the growth parameter r increases the behavior
becomes chaotic, but the "interference" still shows up.
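A frequency map of the kind described can be approximated with a few lines of code. This is a sketch under stated assumptions: the growth parameters, starting value, number of steps and bin layout are illustrative choices, not the settings used for the paper's figures.

```python
# Illustrative sketch of the Verhulst process and its frequency distribution.

def verhulst_orbit(r, x0=0.1, steps=5000, burn_in=500):
    """Iterate x_{n+1} = (1 + r) x_n - r x_n^2, discarding the transient."""
    x = x0
    orbit = []
    for k in range(steps + burn_in):
        x = (1.0 + r) * x - r * x * x
        if k >= burn_in:
            orbit.append(x)
    return orbit


def frequency_map(orbit, bins=50, low=0.0, high=1.5):
    """Histogram of visited states: how often each interval is occupied."""
    counts = [0] * bins
    width = (high - low) / bins
    for x in orbit:
        index = min(bins - 1, max(0, int((x - low) / width)))
        counts[index] += 1
    return counts


for r in (1.8, 2.3, 2.8):
    counts = frequency_map(verhulst_orbit(r))
    occupied = sum(1 for c in counts if c > 0)
    print(f"r = {r}: {occupied} occupied bins out of {len(counts)}")
```

For small r the orbit settles on the stable equilibrium and a single bin is occupied; for intermediate r the states alternate between two values; for larger r the orbit wanders over a band of states and the histogram shows uneven peaks, which is the pattern the text compares to interference.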


State Space 

State Space can be regarded as the "arena" where dynamic systems perform and leave
their trails and is defined as the set of all possible states a certain dynamic system can attain.
The definition of state space presumes the definition of state variables and supports and their
value sets. Reaching this stage is equivalent to climbing the first rung in a ladder of
epistemological levels [Klir 1, pp. 16, 33-64], defining a Source System; i.e., isolation of a
system from reality to the point where everything is ready to perform observations and to get
data.


Definition of state variables and their nature almost defines the nature of state space. It
only remains to define some other general properties of such a space, i.e., metrics,
continuity, compactness, discreteness, order, invariance requirements, etc.

There is no reason to suppose that any two dynamic systems should have the same state 
space, not even two distinct Source System definitions from the same dynamic system. This is 
why we need to define the essential properties which a state space should have in order to reach 
a meaningful methodology. 

1.- State space is a metric space $S = (X, \delta)$, where X is a set of elements (points)
and $\delta$ is a distance function satisfying

a) $\delta(x,y) = 0 \iff x = y$, $x, y \in X$,

b) $\delta(x,y) = \delta(y,x)$, $x, y \in X$,

c) $\delta(x,z) \le \delta(x,y) + \delta(y,z)$, $x, y, z \in X$
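For a finite set of states the three requirements can be checked exhaustively. The sketch below is an illustration only; the sample states and the two candidate distance functions are hypothetical and are not taken from the paper.

```python
# Illustrative check of the metric axioms a), b), c) on a finite state set.
from itertools import product


def is_metric(points, distance):
    """Return True if `distance` satisfies axioms a), b) and c) on `points`."""
    for x, y in product(points, repeat=2):
        if (distance(x, y) == 0) != (x == y):      # a) zero exactly on the diagonal
            return False
        if distance(x, y) != distance(y, x):       # b) symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if distance(x, z) > distance(x, y) + distance(y, z):  # c) triangle inequality
            return False
    return True


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def squared_euclidean(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


states = [(0, 0), (1, 2), (3, 1), (4, 4)]  # hypothetical states with integer coordinates

print("manhattan is a metric:", is_metric(states, manhattan))
print("squared Euclidean is a metric:", is_metric(states, squared_euclidean))
```

The squared Euclidean distance is symmetric and vanishes only on identical states, yet it fails requirement c), so it would not qualify as the distance function of a state space in the sense defined above.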





Cover Space 

product that agrees with the metric defined on S. 

Quantic Uncertainty

Quantum Theory formulations can [...] equivalent to some extent. Basically, we can
say that states of a system correspond to vectors in a specially defined Hilbert space [Dirac, p.
51] or to wave functions [Landau, p. ]. [...] defined on such spaces can be made
to correspond to dynamic variables [...] satisfy some other requirements -being
Hermitian or self-conjugate- they [...] to physical "observables", or
quantities that can be measured. Furthermore, if they have eigenvalues and
eigenvectors, then they are real quantities and can be interpreted as the values assumed by the
dynamic variables associated to [...] when the system is in the state corresponding to
[...] probability distributions for states and values [...] vector or wave
function amplitude and then getting its square root.

Quantum phenomena are subject to an unavoidable and intrinsic kind of uncertainty
manifest at atomic scales, stated first by Heisenberg in the 1920's. Such uncertainty is
responsible for the unexpected results pertaining to Young's experiment -described previously in
this paper. It enters the formalisms where [...] to find it; i.e., associated in some
[...] exponents of certain complex numbers. [...] Classical Mechanics [...]
be seen to have terms of the form

[expression illegible in source]

[...] constant; therefore [...] level scales. Thus, quantum mechanics formalisms introduce
the effects of uncertainty into the phase [...] quantity, and [...] for addition of these
complex quantities over all "possible paths" in transition from one state to another, where paths
which are very close to each other contribute [...] final amplitude, whereas
unlikely paths which require [...] tending to cancel their contributions
out. This is R. Feynman's approach [Feynmann, pp. 31-38]. It should be stressed that these
processes operate on non-scalar [...] at the end when one can
translate that complex amplitude [...] via a scalar product operation, when finding
the modulus of the resulting amplitude. It is very important to realize that quantum
[...] patterns in frequency or probability distributions
[...] manipulating uncertainty as a scalar quantity. The non-scalar nature
of amplitudes is what allows [...]

Thus, we may confidently state that [...] conventional [...] probability
[...] most successful scientific paradigms [...]

Non-scalar Uncertainty 


It should be realised that at least in ^mt^ue’blumd 1 ^ 6 ^^ 

patterns in probabiluyorf^ue^yis^iu^ ^ obJCTVati ^ Jt mese scales. In the same 

3* ^y^Su”^" — - ta - 

measuring processes or by error propagation in calculations. 

It is not difficult to see that the additive and fSt 

uncertainty frameworks (probability, pos« -L^ndependent from the point of view regarding 
destructive contributions from "J®* 8 . W Ji ways £ is the case in non-linear phenomena), 
their origin, but which can interact m stf g y of both events. In other words, 

quantities by ordinary means. 

^-SBransrsswwsrra! 

requirements are due: 

I) If Aj are arbitrary events 

n*ju* - Usy s w,>* - - 


<-ir E Wr.nd^ d,,) 


II) This probability should also "fluctuate" along some support variables:

$P(A_1 \cup A_2 \cup \dots \cup A_n) = g(t_1, \dots, t_m)$, where $g$ is an arbitrary function; the $t$'s are system variables.

See also the line graphs illustrating the Verhulst process previously described.





I„ other words, '<fc. , O is a marginal probability density function which could 

be obtained by the projection of an (m+l)-dimensional joint probability density function 

. This process can be called "collapsing" the support variables. This
"collapse" can be accomplished in many ways: by introducing the time variable and the effects
of uncertainty as phase components (temporarily adding one dimension) and then obtaining the
(squared) modulus by a scalar product (eliminating the time variable), as in quantum mechanics
formalisms; or by first finding the joint probability density function empirically, including the
effects of uncertainty along with the time variable, and then projecting over the state variables

(collapsing the support variables). This can be stated also as a contracted product if * is 

regarded as something like a tensor. 

There are reasons other than notation that make it convenient to consider these as
tensor quantities. Invariance with respect to base changes and basic operational needs make it
desirable to define them as tensor-like arrays. Uncertainty can then be viewed as a tensor-like
quantity of order zero (scalar), order 1, order 2, and so on.

If we define the state of a system under the very general frame described by G.J. Klir
[Klir 1], a general method to manage the effects of uncertainty can be described as first
concentrating these effects on the support variables by "coarsening" their resolution (reducing
the number of possible states of these variables and/or broadening the sampling intervals),
allowing a better determination of the true system variables, and then collapsing all support
variables, leaving only support-independent frequency or probability distribution functions. This
is only one way to take advantage of the uncertainty/simplicity trade-off pointed out by Klir
[Klir 2]. A first contribution is that by regarding the system states and the frequency and
probability distributions as tensor-like quantities we get invariance to changes of base and some
operational advantages inherited from their new algebraic status. Thus, a state of the system, a
vector whose components are the values of the system variables suitably defined by a methodology
like Klir's [Klir 1], becomes $S_i$, a subspace of $C^m$, the cover space, $1 \le i \le m$, where $m$ is the number of
state variables excluding supports. The overall system behavior array, with as many dimensions 
as system variables and supports, and ones in those elements which correspond to observed 

overall states is 

B where t 4 is the k* support 


! 1 . 


' i 


u 

I 


Here, xS f is the cartesian product of all the state variables’ value sets. If * s 3,1 ^ 

i 

ones array, then the unnormalized frequency distribution function becomes 




fL jL 




this is the joint frequency distribution of all states. The dot indicates a generalized matrix product
reducing over the repeated indexes. Of course, we can leave some supports in and some out,
obtaining the corresponding joint frequency distribution:


xS, 

I 


1 9 X »-’ X ' . 1 ? = 4 > lrt - 

xS, 




where x ■ x means "include all indexes except the barred ones . 

"* "r 

To normalize the frequency distributions we divide each element by the scalar 

xS, 

v = 4> trt * • 1' 

*S| t i' t * 


which is the total number of observed states, so 


* 


xJ, 




v ' 


and we have 




xS, 

i 


xS, 

1' 


= 1 
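The collapse of support variables and the normalization described above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the paper's notation: the observed overall states are stored as tuples (state values followed by a support value), the data are invented, and `collapse` and `normalize` are hypothetical helper names.

```python
from collections import Counter

def collapse(observed, keep):
    """Sum the all-ones behaviour array over every index not listed in `keep`.

    `observed` is a set of tuples, one per observed overall state (each such
    cell of the behaviour array holds a one); `keep` lists the tuple positions
    that survive the contraction.
    """
    counts = Counter()
    for state in observed:
        counts[tuple(state[i] for i in keep)] += 1
    return counts

def normalize(counts):
    """Divide each count by the total number of observed states."""
    total = sum(counts.values())
    return {state: c / total for state, c in counts.items()}

# Hypothetical data: (state variable x, state variable y, time support t).
observed = {("lo", "a", 0), ("lo", "a", 1), ("hi", "a", 2), ("hi", "b", 3)}

freq = collapse(observed, keep=(0, 1))   # the time support is collapsed
prob = normalize(freq)                   # entries now sum to one
```

Keeping a different set of positions in `keep` corresponds to leaving some supports in and some out.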


State Transition Uncertainties 

Transition from one state to another involves computing T(x,y), a function expressing 
the difficulty of going from state x to state y in terms of the time variable. I.e., it must be 
proportional to the time taken to go from one state to the other. Then, we can define a two- 
dimensional quantity 







*W>) - 

a 


which represents the non-scalar likelihood of going from state a to state b. If time is not the only
support we wish to "collapse", then we need another component in this vectorial (complex or
hypercomplex) quantity. All quantities can be normalized so that we get real probabilities
when obtaining their norms by means of scalar products. Finding the function T means that we know
the state transition structure of the system and can relate it to time or other support variables.
An important comment about the state-transition likelihood is that the sum is computed along
all possible paths from a to b in such a way that those paths which differ very little from each
other have a more important contribution to the final non-scalar transition likelihood. So we can
say that there are preferred paths in any system.

It is convenient to express all these quantities with complex or hypercomplex numbers, 
but it is clear that they can be represented in other algebraic settings. 

In the simple example of Young's experiment, we can simplify T to be proportional to
the length of the paths followed by the particles; it is then evident that we should get an
interference pattern, since uncertainty can be referred to a distance (wavelength), too.
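The interference claim can be illustrated numerically. The sketch below assumes, in the spirit of the path-sum picture, that each path contributes a unit complex amplitude whose phase is 2π times its length divided by the wavelength, and that the scalar likelihood is the squared modulus of the sum. The function name, the geometry and the numerical values are illustrative assumptions, not taken from the paper.

```python
import cmath
import math

def two_slit_intensity(x, slit_sep=1.0, screen_dist=100.0, wavelength=0.1):
    """Squared modulus of the summed amplitudes of the two straight paths."""
    total = 0j
    for slit_y in (-slit_sep / 2, slit_sep / 2):
        length = math.hypot(screen_dist, x - slit_y)   # T taken as the path length
        total += cmath.exp(2j * math.pi * length / wavelength)
    return abs(total) ** 2

centre = two_slit_intensity(0.0)   # equal path lengths: the amplitudes add
dark = two_slit_intensity(5.0)     # paths differ by about half a wavelength
```

Adding the two probabilities as scalars would give a constant value at every screen position; the pattern only appears because the contributions are combined before the modulus is taken.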


Conclusion 

This paper contributed the concept of general non-linear uncertainty and a proposed
methodology to deal with it. The advantage of using it is a simplification of mathematical
models due to the controlled admission of uncertainty.

Dynamic systems subject to uncertainty are cases where the ordinary treatment and calculus
of uncertainty is not enough to provide an adequate description of the system. Therefore, a more
general and powerful calculus is needed, in which scalar algebra in the real interval [0,1] is replaced
by a complex or hypercomplex field, which has a much richer structure and generality. This
calculus is homeomorphic to the methods of Quantum Mechanics, and its study and development
throws much light on foundational issues of Quantum Mechanics and on the now available
mathematical tools for managing uncertainty. Also, phenomena such as "interfering alternatives",
so basic to Quantum Mechanics, find a very elegant explanation in this framework.


References 

1. - [Feynmann 1] R.P. Feynman & A.R. Hibbs. "Quantum Mechanics and Path Integrals",
McGraw-Hill.

2. - [Landau 1] L. Landau & E. Lifshitz. "Mecánica Cuántica", Mir.








3. - [Dirac 1] P.A.M. Dirac. "Mecánica Cuántica", Ariel.

4. - [Klir 1] G.J. Klir. "Architecture of Systems Problem Solving", Plenum 1986. 

5. - [Klir 2] G. J. Klir and T. Folger. "Fuzzy Sets and Information Theory". 1989. 

6. - [Peitgen] H.-O. Peitgen and P.H. Richter. "The Beauty of Fractals". Springer-Verlag. 






N93-29566 


Comparison Between The Performance Of 
Two Classes Of Fuzzy Controllers 


Janabi T.H. 
Director of Technology 
Mentalogic Systems Inc. 
7500 Woodbine Avenue 
Suite 310 

Markham, Ontario 
L3R 1A8 Canada 


Sultan L.H. 

Director of Research & Development 
Mentalogic Systems Inc. 
Associate Professor 
Department of Computer Science 
York University
2275 Bayview Avenue 
Toronto, Ontario 
M4N 3M6 Canada 


September 30, 1992 


Abstract 


This paper presents an application comparison between two classes of fuzzy controllers: the
Clearness Transformation Fuzzy Controller (CTFC) and the CRI-based Fuzzy Controller. The
comparison is performed by studying the application of the controllers to simulation examples
of nonlinear systems. The CTFC is a new approach for the organisation of fuzzy controllers
based on a cognitive model of parameter driven control, the notion of fuzzy patterns
to represent fuzzy knowledge, and the Clearness Transformation Rule of Inference (CTRI) for
approximate reasoning. The approach facilitates the implementation of the basic modules of
the controllers (the fuzzifier, the defuzzifier and the control protocol) in a rule-based architecture.
The CTRI scheme for approximate reasoning does not require the formation of fuzzy relation
matrices, yielding improved performance in comparison with the traditional organisation of
fuzzy controllers.


1 Fuzzy Logic Controllers 


Fuzzy controllers have emerged in engineering practice as a convenient tool for modelling the operator's
knowledge and experience of controlling complex processes and systems. The basic assumption behind their
dissemination is the ability to imitate the approximate reasoning mechanisms that the human operator
applies to make decisions in complex and vague situations. The great works of L.A. Zadeh [5, 6, 7, 8] on fuzzy
reasoning have opened a new avenue for artificial approximate reasoning, which is required to intellectualize
machine decisions. The compositional rule of inference (CRI), which Zadeh introduced as a tool for
approximate reasoning [6], has been successfully applied to the synthesis of the linguistic control protocols
of skilled operators, thereby making the design of fuzzy logic controllers possible.

However, no systematic approach exists for the design of fuzzy controllers. The main drawback seems to
lie in the application of the CRI scheme, which requires the formulation of fuzzy relation matrices and the
performing of the Max-Min operations associated with them. For complex processes these matrices are
multidimensional, and the computation time required to perform the Max-Min operations can go beyond
real-time problem solving and control requirements.

Mentalogic Systems Inc. developed a new approach for the design of fuzzy controllers based on the operator
cognitive model of fuzzy control [4], using a new approximate reasoning scheme that requires neither
the fuzzy relation matrices nor the Max-Min operations associated with them. This scheme is called the
Clearness Transformation Rule of Inference (CTRI). It is a real-time approximate reasoning scheme in which
calculations are remarkably reduced in comparison with the CRI.


2 The Operation of Fuzzy Controllers 


Fuzzy logic controllers can be classified as control expert systems capable of interpreting fuzzy statements
of human knowledge such as "pressure is low" or "decrease steam flow slightly". Using the CRI scheme,
the control actions are deduced by the composition of fuzzy sets generated from the measured values of the
process variables (which are the input to the fuzzy controller) and the matrices of fuzzy rules (knowledge
of the input-output relationship) using the algebraic operations of Max and Min. Fuzzy logic controllers
map crisp input data into fuzzy linguistic terms described by vectors (fuzzification), deduce the control
actions as fuzzy sets, also in the form of vectors, using the CRI, then translate these actions into crisp
data (defuzzification) which is applied to regulate the controlled process. The overall operation of the fuzzy
controllers can be looked upon as a numerical mapping procedure in which the compositions of fuzzy sets and
fuzzy rules are handled by the CRI, while the controller provides numerical-to-linguistic (fuzzification) and
linguistic-to-numerical (defuzzification) converters to communicate with the controlled process.
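As a rough illustration of the CRI step described above, the following sketch builds a fuzzy relation matrix for a single rule and composes a fuzzified input with it using Max and Min. The use of the min operator to build the relation (a Mamdani-style choice), the membership vectors and the function names are assumptions made for illustration; the paper does not specify them.

```python
def relation_matrix(antecedent, consequent):
    """Fuzzy relation R[i][j] = min(antecedent[i], consequent[j]) for one rule."""
    return [[min(a, c) for c in consequent] for a in antecedent]

def max_min_compose(fuzzy_input, relation):
    """CRI step: output[j] = max over i of min(fuzzy_input[i], relation[i][j])."""
    n_out = len(relation[0])
    return [
        max(min(x, row[j]) for x, row in zip(fuzzy_input, relation))
        for j in range(n_out)
    ]

# Hypothetical membership vectors over small discretised universes.
pressure_low = [1.0, 0.6, 0.2, 0.0]      # antecedent "pressure is low"
increase_flow = [0.0, 0.5, 1.0]          # consequent "increase steam flow"

R = relation_matrix(pressure_low, increase_flow)
observed = [0.0, 1.0, 0.0, 0.0]          # crisp reading fuzzified as a singleton
action = max_min_compose(observed, R)
```

The relation matrix has one entry per pair of input and output points, so its size grows with the product of the universe sizes; this is the cost the text attributes to the CRI scheme when the rule base is multidimensional.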

The CTFC fuzzy controller, however, is designed following the operator cognitive model of control [1, 2, 3].
It has a modular structure in which each module performs a set of distinct tasks. These tasks are
fuzzification, rule selection, approximate reasoning, and defuzzification. Contrary to the CRI designs of
fuzzy controllers, which are data processing devices, the CTFC is a cognitive pattern processing device which
recognizes fuzzy patterns and processes them to perform its decision making procedure. An overall account
of the CTFC controller is as follows. The controller receives crisp data which represents the states of the
process variables to be controlled. This data is channeled to the fuzzifier module, which recognizes their fuzzy
patterns and their clearness assessments in a cognitive manner. The output of the fuzzifier is then used by
the Domain Knowledge-Base and approximate reasoning module for rule matching and clearness assessment
of the fuzzy patterns of the process situation. The defuzzifier then generates the fuzzy control actions, which
are translated to control commands in the form of crisp data and subsequently sent to regulate the
process.

In this controller a fuzzy pattern is defined by the triple { S, D, A }, where:

S - is the syntactical description of a fuzzy pattern. The logic of fuzzy predicates is utilized to describe the
fuzzy patterns of real-world situations. A fuzzy predicate, as an atomic formula of this logic,
is considered an elementary fuzzy pattern. Complex fuzzy patterns are described as well-formed formulae
(WFF) of this logic.

D - is the domain to which the fuzzy pattern is attached. This domain is composed of three attributes:

$L_x$ : is the domain variable.

$X$ : is the space of all instantial models of $L_x$.

$\omega$ : is the set of allowable substitutions of the models of $X$ for $L_x$.

A - is the clearness assessment of a fuzzy pattern. This assessment employs a clearness measure built in the 






closed interval [0, 1] and divided into a finite number of clearness values (a*). 

Two types of fuzzy patterns are employed by the CTFC controller: the static fuzzy patterns stored in the
knowledge-base of the controller, and the dynamic fuzzy patterns denoting the patterns detected in real
time. Static and dynamic patterns have the same syntactical description but may differ

<»*- ■«* ^ *— —— 

are employed to describe the static and dynamic fuzzy patterns. 




2.1 Process Representation 


In the examples presented in this paper, the variables which are used to represent the process are
the error (e) and the change of error (ce), as defined in equations (1) and (2) below. The fuzzy
predicates utilized for each variable are shown in figure (2). The same fuzzy sets are used for the
error, change of error and output. The control rules were defined for each controlled system.

gn-nl-- “> 

$ce = e_{present} - e_{previous}$ (2)


where:

$e_{present}$ = present error in the output response (for a unit step input).
$e_{previous}$ = previous error in the output response (for a unit step input).
$ce$ = change of error in the process response.


The control rules which are used in the fuzzy controller are application dependent. To formulate the control
protocol we generally started with some approximate rules, then improved these rules in the direction which
improved the controller performance, so as to obtain a better process output response.


For the two variables chosen, the error and the change of error, sixty four rules were sufficient to describe 
the control requirement for each simulation example. 


2.2 Simulation Results 

To compare the performance of the CTFC and CRI-based controllers, simulations were performed using
the same controlled systems under the same simulation conditions, which are achieved by employing the
same fuzzy sets and control rules for both controllers. The systems chosen are nonlinear and represent
problematic systems from the control point of view. Their synthesis reflects the capability and limitation of
each controller. The systems are single-input single-output closed loop nonlinear systems with single valued
and double valued nonlinearities. Two examples are presented here. The first example involves a single
valued nonlinearity, and the second example involves a double valued nonlinearity. Figure (1) shows a block
diagram of the closed loop system.

Example 1 

In this example, the linear element is a second order system having a free integrator, and described by the
transfer function

$G(s) = \dfrac{2.5}{s^2 + 0.3s + 0.1}$

The nonlinear element is on-off plus dead-zone as shown in figure (3). The rules which are used for the
control of this system are shown in figure (5). The system response before and after compensation using both
the CTFC and CRI-based controllers is shown in figure (4). Both controllers were capable of eliminating





the steady-state error caused by the dead-zone. However, the CTFC controller response is much smoother
than that of the CRI-based controller. The gain of the CRI-based controller had to be raised to obtain this
response. Lowering the gain to that of the CTFC controller gave zero output response, because the controller
output always fell within the dead-zone of the nonlinearity and was never able to emerge from it.
The CTFC was capable of addressing this system without requiring any outside interference or
help, reflecting better capability and higher intelligence in handling difficult systems. Note the elimination of
steady-state error despite the presence of the dead-zone.
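The gain effect described above follows directly from the shape of the nonlinearity. A minimal sketch of an on-off element with dead-zone is given below; the dead-zone half-width, the output level and the example gains are hypothetical values, since the paper gives the characteristic only in figure (3).

```python
def on_off_dead_zone(u, half_width=0.5, level=1.0):
    """On-off (relay) characteristic with a dead-zone around zero."""
    if abs(u) <= half_width:
        return 0.0                      # input inside the dead-zone: no output
    return level if u > 0 else -level   # otherwise full output with the input's sign

controller_output = 0.3                 # hypothetical normalised control action
low_gain = on_off_dead_zone(1.0 * controller_output)    # stays inside the dead-zone
high_gain = on_off_dead_zone(2.0 * controller_output)   # pushed past the dead-zone
```

With the lower gain the plant never receives a drive signal, which is consistent with the zero output response reported for the CRI-based controller.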


Example 2 

In this example, the linear element is a second order system having double integrator, and described by the 

transfer function < 

G(s) = ? 

The nonlinear element is a backlash nonlinearity as shown in figure (3). The rules which are selected for
the fuzzy controller are displayed in figure (7). The system response before and after compensation using
both controllers is shown in figure (6). In this system the CTFC controller yielded an excellent response, while
the CRI-based controller failed completely in addressing this system. The superiority of the CTFC over the
CRI-based controller is clearly reflected in this example. It is interesting to note that the nonlinear element
in this system is a double valued nonlinearity.
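A backlash element is double valued because its output depends on the direction from which the input arrived. The sketch below shows one common way to model it, with an output that follows the input only once the gap has been taken up; the gap width, the initial outputs and the class name are assumptions, not values from the paper.

```python
class Backlash:
    """Backlash (mechanical play) with total gap width `gap`."""

    def __init__(self, gap=1.0, initial_output=0.0):
        self.half = gap / 2.0
        self.y = initial_output

    def step(self, u):
        """Update and return the output for input `u`."""
        if u - self.y > self.half:        # input pushes against the upper face
            self.y = u - self.half
        elif self.y - u > self.half:      # input pushes against the lower face
            self.y = u + self.half
        return self.y                     # inside the gap the output holds

rising = Backlash(gap=1.0)
out_up = [rising.step(u) for u in (0.0, 1.0, 2.0)]      # approach u = 2 from below
falling = Backlash(gap=1.0, initial_output=3.5)
out_down = [falling.step(u) for u in (4.0, 3.0, 2.0)]   # approach u = 2 from above
```

The same input value u = 2 gives two different outputs depending on the history, which is what makes this element harder to compensate than the single valued dead-zone.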


3 Evaluation and Conclusions 


A comparative simulation study has been conducted between the CTFC and the CRI-based fuzzy controllers
to illustrate the capabilities of each controller in addressing difficult control systems. The systems chosen for
the comparison study are nonlinear control systems. One system was chosen with a single valued nonlinearity
and the other with a double valued nonlinearity. For the comparison to have a meaningful interpretation,
the same fuzzy sets and control rules were employed in both controllers.

The results of the simulation show a clear advantage of the CTFC controller over the CRI-based controller.
The CTFC was capable of addressing both systems, giving a smooth response for each, while the CRI-based
fuzzy controller gave a 25% overshoot in the first system and failed completely in addressing the second
system.

The simulation examples also reflect the capability of the CTFC fuzzy controller in addressing systems with
double valued nonlinear elements, and clearly illustrate the optimum solution embodied in this controller.


References 

[1] J. Rasmussen, "Outlines of a Hybrid Model of the Process Plant Operator", in T.B. Sheridan and
G. Johannsen, Eds., Monitoring Behaviour and Supervisory Control, N.Y., Plenum Press, 1976.

[2] J. Rasmussen, "A Frame for Cognitive Task Analysis in Systems Design", in: Intelligent Design Support
in Process Environments (E. Hollnagel et al., Eds.), NATO ASI Series, Vol. 21, pp. 175-195, 1986.

[3] J. Rasmussen, "Information Processing on Human-Machine Interaction, An Approach to Cognitive
Engineering", Vol. 12, N.Y., North-Holland Series in Science and Engineering, 1986.

[4] L.H. Sultan and T.H. Janabi, "Truth Transformation Fuzzy Logic Controllers: Outlines of the Design
of a New Generation of Fuzzy Controllers", AI Conference, Wyoming, 1991.






[5] L.A. Zadeh, "Fuzzy Sets", Information and Control, Vol. 8, pp. 338-353, 1965.

[6] L.A. Zadeh, "Outline of a New Approach to the Analysis of Complex Systems and Decision Processes",
IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-3, No. 1, pp. 28-44, 1973.

171 Cfciflt,, nanl ■ «»»• »CB/ERt M 77/12, Vmnmi, 

[8] L.A. Zadeh, "The Concept of a Linguistic Variable and Its Application to Approximate Reasoning, Parts
I and II", Inf. Science, 8, pp. 199-249 and 301-357, 1975.











Figure 5. Control Rules for Example 1


The abbreviations used are:
E = Error
CE = Change in Error
CA = Control Action
NB = Negative Big
NM = Negative Medium
NS = Negative Small
NZ = Negative Zero
PB = Positive Big
PM = Positive Medium
PS = Positive Small
PZ = Positive Zero
















Figure 7. Control Rules for Example 2














N 9 3 " 2 9 5 6 1 ?. 


Possibilistic Measurement and Set Statistics 


Cliff Joslyn * * 




Abstract 

, • • to senerate possibility distributions from measured data. Methods 

» rasr r? jsssrtsa ssss 

<■» «■*"*** * — »•“ *“ “ k “ mi 

core; and consonant intervals constructed from statistical data. 

1 Introduction 

. .nnlirations of possibility theory beyond its traditional uses in the 
My overall interest is to expaa PP ^ . knowledge-based control systems, artificial intelli- 
engineering of human-created teclm g» _ e ^ ^^ling 0 f natural, complex systems. In order to 

gence and approximate '^ tic8 of p0BS |bility beyond traditional interpretations based on the 

transform some measured probabi is ic , ^ resulting possibilistic representation is never 

tS2% - *£ STn a Lm JL Mr » a* — - 

'“The additivity of 

elements of any dbioin. cln». Therefore, i,^Cri£niZ m P=~ibiy no.-diM ™n b 
'O’ZSf Z - Mic originally adviced by Wnng and biu 117), a* developed robe by 

Mdi™ it “ d tto pip«t,3iS tot lb. collection of set MUc. « developed, inclndmg direct collection 
o"lnS data, J lo generntfon of internl. from Pomt-d*. tanma. 


2 Mathematical Preliminaries 


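We begin with the standard evidence theory. Given a finite universe $\Omega = \{\omega_i\}$, $1 \le i \le n$,
the set function $m: 2^\Omega \to [0,1]$ is an evidence function when $m(\emptyset) = 0$ and
$\sum_{A \subseteq \Omega} m(A) = 1$. Denote a random set generated from an evidence function as
$S = \{\langle A_j, m_j \rangle\}$, where $\langle \cdot \rangle$ is a vector, $A_j \subseteq \Omega$, $m_j = m(A_j)$, and
$1 \le j \le N$. Denote the focal set as $\mathcal{F} = \{A_j : m_j > 0\}$ with core $C(\mathcal{F}) = \bigcap_j A_j$. The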





Possibilistic Measurement 


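dual belief and plausibility measures on $\forall A \subseteq \Omega$ are $\mathrm{Bel}(A) = \sum_{A_j \subseteq A} m_j$ and
$\mathrm{Pl}(A) = \sum_{A_j \cap A \neq \emptyset} m_j$.

The plausibility assignment (otherwise known as the contour function, falling shadow, or one-
point coverage function) of $S$ is

$\mathrm{Pl} = \langle \mathrm{Pl}(\{\omega_i\}) \rangle = \langle \mathrm{Pl}_i \rangle, \quad \mathrm{Pl}_i = \sum_{A_j \ni \omega_i} m_j.$

$\mathrm{Pl}$ is a fuzzy set that can be mapped to an equivalence class of random sets [8].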

When 4 € f.\A,\ = 1, then 5 i. .pedSc, end BeP,) = Fl<4) = F-Mi) prob,bd,,y 

with prob.bffity di.trib»tio» PI = P = (Pi) ""r 1 ”"”" P.f / ‘ 

* t-v W v iiftitl when (without loss of generality for ordering, and letting Aq — 0) A } - 1 C A } . 

”“‘)= *• p- <• «**>»•. » » » » >i“ — ““ 

n fl) A i) = V H(Aj), where V is the maximum operator. Denoting A< = {wi,w>, • •••"<}■ assuming 
that T is complete (i.e. Vw f € 0,3*), then PI = 9 = (»,) is a possibility distribution with maximal 
normalization Vi lr * = 1- 

2.1 Consistency and Consonance 

c . when vt 0. Each consonant random set is consistent with core C(F) = An and T 

and sufficient for V P»i = 1. Thus a consistent but non-consonant random 

set has a maximal possibility distribution $\mathrm{Pl} = \pi$, but its plausibility measure Pl is not a possibility measure
$\Pi$. While an additive probability distribution uniquely determines a measure and random set, a maximal
possibility distribution does not. However, a possibility measure $\Pi^*$ that is optimally approximate can be
constructed according to the formula $\forall A \subseteq \Omega$, $\Pi^*(A) = \bigvee_{\omega_i \in A} \pi_i$. When $S$ is already consonant, then of
course $\Pi^* = \Pi$.

Dubois and Prade [3] suggest that the plausibility assignment of a consistent but non-consonant random
set $\mathrm{Pl} = \pi$ should not be taken as a possibility distribution, but rather should be used to derive a nest from
which $\Pi^*$ can be generated. That nest is the focal set of the
measure $\Pi^*$, denoted $\mathcal{F}^* = \{B_k\}$. The evidence for each focal element, denoted $m^*_k = m^*(B_k)$, is given by
the formula

$m^*_k = \sum_{A_j \subseteq B_k} m_j - m^*_{k-1}$

where $m^*_0 = 0$. This method results in a greater constraint on the evidence provided by $m$, and thus the loss
of some information available in a consistent $S$ (see example in Section 4).
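The derivation of evidence on a nest can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the nest is passed in explicitly instead of being derived from the possibility distribution, intervals are written as `(lower, upper)` pairs standing for half-open sets, and each nested set is given the evidence contained in it minus what was already assigned to the smaller nested sets, which is an assumed cumulative reading of the formula above. The data are those of the example in Section 4.

```python
def contained(inner, outer):
    """True when the half-open interval `inner` = [a, b) lies inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def nested_evidence(random_set, nest):
    """Evidence for each set of an increasing nest B_1, B_2, ... .

    `random_set` maps focal intervals (a, b) to their evidence m_j; `nest`
    lists the nested intervals from smallest to largest.
    """
    evidence, assigned = [], 0.0
    for b_k in nest:
        belief = sum(m for a_j, m in random_set.items() if contained(a_j, b_k))
        evidence.append(belief - assigned)   # increase in belief over the previous set
        assigned = belief
    return evidence

# Empirical random set and nest taken from the example in Section 4.
S = {(1.0, 2.0): 0.5, (1.5, 3.5): 0.25, (1.5, 4.0): 0.25}
nest = [(1.5, 2.0), (1.0, 3.5), (1.0, 4.0)]
m_star = nested_evidence(S, nest)
```

The smallest nested set receives no evidence here because no observed interval fits inside it, which is the kind of information loss the text mentions.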

2.2 Consistent Transformations 

When $\mathcal{F}$ is not consistent, then $\bigvee_i \mathrm{Pl}_i < 1$. Here a set of focused consistent transformations $S_{\omega_i}$ can be
constructed from $S$ [10, 11]. $\forall \omega_i \in \Omega$, $S_{\omega_i}$ is a consistent approximation of $S$ with evidence function [10]

$m_{\omega_i}(A) = \begin{cases} m(A) + m(A - \{\omega_i\}), & \omega_i \in A \\ 0, & \omega_i \notin A. \end{cases}$

The effect is to create a core $\{\omega_i\}$ with focus $\omega_i$. Under the transformation, the sub-maximal
plausibility assignment $\mathrm{Pl} = \langle \mathrm{Pl}_1, \mathrm{Pl}_2, \dots, \mathrm{Pl}_n \rangle$ is transformed into a maximal possibility
distribution $\pi = \langle \mathrm{Pl}_1, \mathrm{Pl}_2, \dots, 1, \dots, \mathrm{Pl}_n \rangle$, with the value 1 in position $i$. $\pi$ in turn generates a
consonant random set. The aim of this method is to choose the "correct" $\omega_i$ as a focus, and to elevate the
possibility of that element to 1 as a possibilistic normalization. While there are many methods to choose
$\omega_i$, to date only the Principle of Minimal Information Distortion [10] (or Information Loss [11]) has been
studied. Given a random set $S$, that focused consistent transformation $S_{\omega_i}$ is selected so that the total
information content of $S_{\omega_i}$ is as close as possible to that of the original $S$. Details of the measure of total
information can be found elsewhere [7, 10, 15].
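A focused consistent transformation on a finite universe can be sketched as below. The representation (focal elements as `frozenset` keys of a dictionary), the function names and the sample random set are assumptions for illustration. Moving the evidence of each focal element A to A ∪ {ω} is equivalent to the evidence function above, since a set containing ω collects its own evidence plus that of the same set without ω.

```python
def focus_on(random_set, focus):
    """Focused consistent transformation of a random set on element `focus`.

    `random_set` maps frozensets (focal elements) to evidence values. Each
    focal element A keeps its evidence if it already contains `focus`, and
    otherwise hands it over to A | {focus}.
    """
    transformed = {}
    for a, m in random_set.items():
        target = a | {focus}
        transformed[target] = transformed.get(target, 0.0) + m
    return transformed

def plausibility_assignment(random_set, universe):
    """Pl_i: total evidence of the focal elements containing each element."""
    return {w: sum(m for a, m in random_set.items() if w in a) for w in universe}

# Hypothetical inconsistent random set: no element is common to all focal sets.
S = {frozenset("ab"): 0.5, frozenset("bc"): 0.25, frozenset("d"): 0.25}
S_b = focus_on(S, "b")
pl_before = plausibility_assignment(S, "abcd")
pl_after = plausibility_assignment(S_b, "abcd")
```

Only the plausibility of the focus changes (it is raised to 1); the other entries of the assignment are left as they were, in line with the description of the transformed distribution.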






3 Empirical Random Sets 


Assume that some pher.omenaUystem can *^ ent Fo 7 example, a thermometer 

tion of a measurement on ft r«u £ ; n yield a result of 72 degr ees, 72 € {0, 1, .... 100). 

calibrated m integral degreeson the.nterval L 0, 100 c y of observation8 of Wj . Then 

Assume a N ^e^LributXon on ft is /:ft ~ M./M = U = «/*• 

probability distribution on Q with an additive measure F:2« - 

[0,ll,F(A) = £ Wl6 „/i- 

3.1 General Measuring Devices 

, . __ 1 : 1 .- *his due to necessary measurement uncertainty. Most 

ss r- sAc * -r - — “ 

of the interval A leaves uncertainty ^ measuring device is defined as a class 

c J ^ cVn.T <°j' < N'- The Mature of the measuring device will depend on the elements and structure 

° f "Assume a collection of set observations A* 6 C. 1 < * < M. ^ ^ 

A k ' = A * a . Therefore the A k form a multi-set, denoted as a vector A - {A ,A^, )■ 

a • A set ? E C C is the set of subsets that are actually observed in A. T E is derived by el, minatmg 
derived focal set ,? C E x < j < N < N , N < M and VA, € ^.A, € A, and 

the duplicates in A. Let T* - wnere ^ c J -- 

inclusion of an element in a vector is de ne ® wou * e * x where VA,- € F E ,C. is the number of 
Now establish a set-counting function $C: \mathcal{F}^E \to \mathbb{Z}$, $C_j = C(A_j)$, where $\forall A_j \in \mathcal{F}^E$, $C_j$ is the number of
occurrences of $A_j$ in $A$. Finally the set-frequency function is arrived at:

$m^E: \mathcal{F}^E \to [0,1], \quad m^E(A_j) = m^E_j = \dfrac{C_j}{\sum_j C_j} = \dfrac{C_j}{M}.$

The intention is obvious: since $\sum_j m^E_j = 1$ and $\emptyset \notin \mathcal{F}^E$, $m^E$ is a natural evidence function on $\Omega$,
generating an empirically derived random set denoted $S^E$.
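The set-counting and set-frequency functions amount to counting duplicate observations. A minimal sketch follows, assuming interval observations are encoded as `(lower, upper)` pairs for half-open intervals; the function name is hypothetical, and the observation vector is the one used later in the example of Section 4.

```python
from collections import Counter

def empirical_random_set(observations):
    """Map each distinct observed subset to its relative frequency C_j / M."""
    counts = Counter(observations)          # set-counting function C
    total = len(observations)               # M, the number of observations
    return {subset: c / total for subset, c in counts.items()}

# Interval observations written as (lower, upper) pairs for half-open [a, b).
A = [(1.0, 2.0), (1.0, 2.0), (1.5, 3.5), (1.5, 4.0)]
S_E = empirical_random_set(A)
```

The keys of `S_E` play the role of the derived focal set, and the values are the evidence assigned to each observed subset.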

3.2 Disjoint Measuring Devices 

of a specific position of the mer y { " * precision of the thermometer. While any particular 

~ ^ <»«*. ««««>.“ ™- ■**- >» ^ ta ””■* 

of description ofC. Thus in this case C itself can be considered as a new universe of discourse ft* _ C — {A,}. 

Because the A; are diflomt, b° will the AV probability distribution, and not an 

Now m E ,s the frequency of thedisjointA «d» mea9uring device are usually parameterized in 

;^Ty^ An -**• distribution - i rac “ ure Me 

derived as for frequencies above 

<v _ r f'- O' ►-* 10. 11. F': 2 n ' ~ TO, 11 


c'(Aj) = Cj = C(A } ), 


~I, /': ft' >-* [0, 1], F':2 n '-I0,l] 


f ■ j 

P 



♦ 






4 Instrument Ensembles 


One way to generate measurements of intersecting subsets is to use an ensemble of classical instruments. That 
ensemble can be considered as either multiple, heterogeneous instruments taking separate measurements at 
the same time, or as a single instrument which is changing its structure over time. 

Let $C^k = \{A^k_{j_k}\}$, $1 \le j_k \le N^k = |C^k|$, be disjoint classes on $\Omega$, and $F = \{C^k\}$ be the family of such classes,
$1 \le k \le M$. The natural partial order on $F$ is

$C^1 \prec C^2 \equiv \forall A^2_j \in C^2,\ \exists \{A^1_{j'}\} \subseteq C^1,\ A^2_j = \bigcup A^1_{j'}.$


When $C^1 \prec C^2$ then $C^1$ refines $C^2$, and $C^2$ coarsens $C^1$. For example, $C^1$ could be a thermometer reading in
tenths of degrees, while $C^2$ could belong to a mutually calibrated thermometer reading in whole degrees. $F$
is consonant whenever the $C^k$ are all comparable under $\prec$ (they are all mutual refinements or coarsenings).
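For classes that partition the same real interval into contiguous subintervals, refinement can be checked by comparing endpoints. The sketch below makes that assumption explicitly, ignores whether individual endpoints are open or closed, and uses hypothetical function names; the four classes are the measuring devices of Example 1 below.

```python
def breakpoints(partition):
    """All interval endpoints of a partition given as (lower, upper) pairs."""
    return {end for interval in partition for end in interval}

def refines(fine, coarse):
    """True when every interval of `coarse` is a union of intervals of `fine`.

    Valid only when both classes partition the same interval into
    contiguous subintervals, so that endpoints determine the partition.
    """
    return breakpoints(coarse) <= breakpoints(fine)

C1 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
C2 = [(0, 1), (1, 2), (2, 3.5), (3.5, 5)]
C3 = [(0, 1.5), (1.5, 3.5), (3.5, 4), (4, 5)]
C4 = [(0, 1.5), (1.5, 4), (4, 5)]

family = [C1, C2, C3, C4]
# The family is consonant when every pair of classes is comparable.
consonant = all(refines(a, b) or refines(b, a) for a in family for b in family)
```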

Letting $A^k$ be the subset observed in device $C^k$, the vector of observations over $F$ is $A = \langle A^k \rangle$, $|A| = M$,
and $A$ generates the empirical random set $S^E$ as described in Section 3.1. If any of the $C^k$ share common
members (in particular, if any of them are equal), then some of the $A^k$ may be equal, yielding multiple
observations in $A$ of certain subsets. Otherwise, all subsets will be observed a single time, and will not
necessarily be disjoint.

Assume observations from two devices, say $A^1 \in C^1$ and $A^2 \in C^2$. It is expected that $A^1 \cap A^2 \neq \emptyset$. In the
event that $A^1 \cap A^2 = \emptyset$, then at least one of the devices $C^1$ or $C^2$ would be regarded as being in error, or perhaps
even the assumption of the "reality" of the quantity being measured would be questioned. Thus, while there
is nothing in the mathematics that would preclude such a result, pragmatic conditions require that $\mathcal{F}^E$ be
consistent, so that $S^E$ has a natural possibility distribution $\pi$ and at worst a constructed possibility measure
$\Pi^*$. In the event that $\mathcal{F}^E$ is nevertheless not consistent, and there are pragmatic reasons for accepting the
results of the measurement, then the focused consistent transformation method outlined in Section 2.2 is
available to construct consistent random sets.

When $F$ is consonant, then without loss of generality for ordering, $C^1 \prec C^2 \prec \dots \prec C^M$. Here if $\mathcal{F}^E$ is
consistent, then it must also be consonant, with $A_1 \subseteq A_2 \subseteq \dots \subseteq A_N$. Of course, in this case a possibilistic
analysis is less useful than it would be otherwise, since there is an absolute gain in accuracy in the movement
towards the finest measurement $A_1$. Nevertheless, the mathematical analysis is available.


Example 1: Let Q = [0,5] C 3t and define a family F of four measuring devices 


C 1 = {[0,1), [1,2), [2, 3), [3, 4), [4, 5]}, 
C 3 = {[0,1.5), [1.5, 3.5), [3.5, 4), [4, 5]}, 


^ = {[0,1), [1,2), [2, 3.5), [3.5, 5)}, 
C* = {[0,1.5), [1.5, 4), [4, 5]}, 


so that M = 4. F is not consonant, but C 3 X C 4 . Measurements are made on each instrument yielding 
a vector of four measurements (Figure 1) 


A = ([1,2), [1,2), [1.5, 3.5), [1.5, 4)). 

After eliminating duplicates, the set of observed intervals $\mathcal{F}^E$ is derived with $N = 3 < M$ and 
random set $S^E$: 

$\mathcal{F}^E = \{[1,2), [1.5,3.5), [1.5,4)\}$, $S^E = \{([1,2), .5), ([1.5,3.5), .25), ([1.5,4), .25)\}$. 

Figure 1: Measurements on four instruments. 


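$\mathcal{F}^E$ is consistent with core $C(\mathcal{F}^E) = [1.5, 2)$, so that $\pi$ attains the value 1 there. With 

$\pi(\omega) = \sum_{A_j \ni \omega} m_j$, 

where $m_j$ is the value that $S^E$ assigns to $A_j \in \mathcal{F}^E$, it follows that 

$\pi(\omega) = \begin{cases} .5, & \omega \in [1, 1.5) \\ 1, & \omega \in [1.5, 2) \\ .5, & \omega \in [2, 3.5) \\ .25, & \omega \in [3.5, 4) \\ 0, & \text{elsewhere} \end{cases}$ 

as shown in Figure 2. 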
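The values of the possibility distribution in Example 1 can be checked mechanically. The following is a minimal sketch, assuming that each focal interval is half-open, that the random set assigns each distinct interval its relative frequency, and that the possibility of a point is the sum of the values of the focal intervals containing it. The function names are illustrative and do not come from the paper.

```python
from fractions import Fraction

# The four observed intervals [lo, hi) of Example 1, one per instrument.
observations = [(1.0, 2.0), (1.0, 2.0), (1.5, 3.5), (1.5, 4.0)]


def empirical_random_set(obs):
    """Map each distinct observed interval to its relative frequency."""
    counts = {}
    for interval in obs:
        counts[interval] = counts.get(interval, 0) + 1
    return {iv: Fraction(c, len(obs)) for iv, c in counts.items()}


def possibility(random_set, w):
    """Sum the values of all focal intervals [lo, hi) that contain w."""
    return sum(m for (lo, hi), m in random_set.items() if lo <= w < hi)


S = empirical_random_set(observations)
for w in (1.2, 1.7, 3.0, 3.7, 4.5):
    print(w, possibility(S, w))
```

One sample point is taken from each piece of the distribution, so the printed values can be compared directly with the case table above.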



Figure 2: $\pi$ determined from $S^E$. 

Dubois and Prade's method, applied to $S^E$, gives 

{([1.5, 2), 0), ([1, 3.5), .75), ([1, 4), .25)} 

over portions of the possibility curve. 

Figure 3: $\pi^*$ determined from Dubois and Prade's method. 

Because i. Unite, w i. piecewise C °"”“°“ c 'X‘“forf^n"hi" 

^%Tl b .zsr-^£l ^ - - . — - — 

forms for possibility distributions (e.g., fuzzy numbers [2]). 


5 Consistent Intervals from Focused Point Data 

» 8 d,vi,eand,i r *,^^ 

is on, nom»l concept of meanu-emen., point, in . highet-level — 

p„.,,t. obww»»» o^dW»t device th.t yields ob«nr«ion. of pomU ,n . 

“^/inhtvn. D C ft wiU be 




Denote an observation as a data point $d \in \Omega$, and the collection of data as a data stream, a multiset 
denoted as the vector $D = (d_i)$, $1 \le i \le n$. The set generated by eliminating duplicates in $D$ is the data 
set $D'$. A possibilistic analysis of $D$ will be approached by using its order statistics [1]. For a given data stream 
$D$, the order statistics, denoted $d_{(i)}$, are a permutation of the $d_i$ such that $d_{(1)} \le d_{(2)} \le \cdots \le d_{(n)}$. $d_{(1)}$ and 
$d_{(n)}$ are called the extremes, and the range interval is $W = [d_{(1)}, d_{(n)}]$. The order statistics of the data set 
$D'$ are $d'_{(i)}$, $1 \le i \le n'$. The $d'_{(i)}$ naturally generate the disjoint intervals $\delta_i = [d'_{(i)}, d'_{(i+1)})$, $1 \le i \le n' - 1$. 
For completeness, let $\delta_{n'} = [d'_{(n')}, d'_{(n')}] = \{d'_{(n')}\}$. Let the set of disjoint intervals be $\Delta = \{\delta_i\}$, so that 
$\bigcup_i \delta_i = W$. 

5.1 Focused Data Intervals 

$\Delta$ thus represents a classical measuring device with the $\delta_i$ partitioning $W$, and so the greatest problem with 
deriving a possibility distribution from $\Delta$ is the lack of a focus, or any core. Thus we posit the existence of a 
focus $u \in W$. The purpose of $u$ is to provide a value on which all the intervals (yet to be determined) agree, 
a value for which $\pi(u) = 1$. $u$ naturally divides $W$ into left and right sub-intervals denoted $W_l = [d_{(1)}, u)$ 
and $W_r = (u, d_{(n)}]$, so that $\forall d_{(i)} \neq u$, $d_{(i)} \in W_l$ or $d_{(i)} \in W_r$. Denote the intervals $A^i$, $1 \le i \le n$, as 
follows: 

$A^i = \begin{cases} [d_{(i)}, u], & d_{(i)} \in W_l \\ [u, d_{(i)}], & d_{(i)} \in W_r \\ \{u\}, & d_{(i)} = u \end{cases}$ 

If $d_{(i_1)}, d_{(i_2)} \in W_l$ with $i_1 < i_2$, then $A^{i_2} \subseteq A^{i_1}$; and if $d_{(i_1)}, d_{(i_2)} \in W_r$ with $i_1 < i_2$, then $A^{i_1} \subseteq A^{i_2}$; 
therefore each of the sets of intervals 

$\mathcal{F}_l = \{A^i : d_{(i)} \in W_l\}$, $\mathcal{F}_r = \{A^i : d_{(i)} \in W_r\}$, 

are nests. Since $\forall i$, $u \in A^i$, the total set $\{A^i\}$ is consistent, forming a focal set $\mathcal{F}^E = \mathcal{F}_l \cup \mathcal{F}_r$ with core 
$C(\mathcal{F}^E) = [u, u] = \{u\}$. $S^E$ is then constructed from the counts of the $d_{(i)} \in D$ of the corresponding interval 
$A^i$. Generally, each $d_{(i)}$ will generate a single count for the interval $A^i$. However, if $\exists i_1, i_2$, $A^{i_1} = A^{i_2}$, then 
multiple counts will be generated as discussed in Section 4. If $u = d_{(1)}$ or $u = d_{(n)}$ then $\mathcal{F}^E$ will actually be 
consonant. 

Example 2: As above, let $\Omega = [0,5]$, and assume that $n = 6$ point observations in $\Omega$ are taken giving the 
data stream $D = (2, 1, 4, 1.5, 2, 4.5)$. The order statistics are 

$d_{(1)} = 1$, $d_{(2)} = 1.5$, $d_{(3)} = d_{(4)} = 2$, $d_{(5)} = 4$, $d_{(6)} = 4.5$ 

and $W = [1, 4.5]$. The corresponding data set is $D' = \{1, 1.5, 2, 4, 4.5\}$ so that $n' = 5 < n$, with order 
statistics and disjoint intervals 

$d'_{(1)} = 1$, $d'_{(2)} = 1.5$, $d'_{(3)} = 2$, $d'_{(4)} = 4$, $d'_{(5)} = 4.5$ 
$\Delta = \{[1, 1.5), [1.5, 2), [2, 4), [4, 4.5), [4.5, 4.5]\}$ 

Assuming that $u \in [2, 4]$, then the focal and random sets (Figure 4, with $u = 3$) are 

$\mathcal{F}^E = \mathcal{F}_l \cup \mathcal{F}_r = \{[1, u], [1.5, u], [2, u]\} \cup \{[u, 4], [u, 4.5]\}$, 

$S^E = \{([1, u], 1/6), ([1.5, u], 1/6), ([2, u], 1/3), ([u, 4], 1/6), ([u, 4.5], 1/6)\}$. 

The possibility distribution is shown in Figure 5. 
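The construction of Example 2 can be sketched in a few lines. This is only an illustration under stated assumptions: intervals are treated as closed, a data point equal to the focus yields the degenerate interval [u, u], and the possibility of a point is the sum of the values of the focal intervals containing it. The function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def focused_random_set(stream, u):
    """Build {closed interval (lo, hi): relative count} for a data stream and focus u."""
    counts = Counter()
    for d in stream:
        # Points left of u give [d, u]; points right of u give [u, d];
        # a point equal to u gives the degenerate interval [u, u].
        interval = (d, u) if d < u else (u, d)
        counts[interval] += 1
    n = len(stream)
    return {iv: Fraction(c, n) for iv, c in counts.items()}


def possibility(random_set, w):
    """Sum the values of all closed focal intervals that contain w."""
    return sum(m for (lo, hi), m in random_set.items() if lo <= w <= hi)


D = [2, 1, 4, 1.5, 2, 4.5]
S = focused_random_set(D, u=3)
for interval, mass in sorted(S.items()):
    print(interval, mass)
print(possibility(S, 3))
```

Because every interval contains the focus, the possibility at u = 3 is the sum of all the values, which is 1.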





Figure 4: Consistent family from focused data set. 





Figure 5: Derived possibility distribution. 


5.2 Choice of Focus 

So far the method by which the focus $u$ can be chosen has not been discussed. While a number of methods 
suggest themselves, selection of methods will depend on user methodology and further empirical research. 
However, in Example 2 the first four methods below all yield $u \in [2, 4]$, which is the inner interval of $\Delta$ (see 
Section 6). 

Sample Mean: Selection of 

$u = \bar{D} = \sum_i d_i / n$ 

is a possibility, although one that is not in keeping with possibilistic concepts. In our example, this 
would yield $u = 2.5$. 

Range Midpoint: The midpoint of $W$, denoted $\bar{W}$, is much more in keeping with possibilistic concepts: 

$u = \bar{W} = (d_{(1)} + d_{(n)}) / 2$. 

It expresses something like the concept of a "possibilistic sample mean". This would yield $u = 2.75$ in 
the example. 

Closest to Range Midpoint: There may be some value in having $u$ actually be one of the data points, so 
that $u \in D'$. This can be done by selecting that $d'_i \in D'$ closest to $\bar{W}$ (yielding $u = 2$ in our example): 

$u = \arg\min_{d'_i \in D'} |d'_i - \bar{W}|$. 





Data-Set Midpoint: The middle point of the data set itself can be chosen, that is 

$u = d'_{((n'+1)/2)}$ 

if $n'$ is odd. If $n'$ is even, then either 

$u = d'_{(n'/2)}$ or $u = d'_{(n'/2+1)}$ 

can be chosen, or the midpoint between them, 

$u = \frac{d'_{(n'/2)} + d'_{(n'/2+1)}}{2}$. 


Principle of Maximum Uncertainty: $u$ can be selected to maximize the total uncertainty of the final 
possibility distribution. Alternatively, selection of $u$ can be guided by the Principle of Uncertainty 
Invariance [13], keeping the uncertainty of the resulting possibility distribution as close as possible 
to the entropy of the frequency distribution of the data. 
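The four simple focus choices of this section can be compared on the data stream of Example 2. The sketch below rests on two assumptions that are not in the text: ties in the closest-to-midpoint rule are broken towards the smaller data point, and for an even number of distinct points the lower of the two middle points is returned. All function names are illustrative.

```python
def sample_mean(stream):
    """Arithmetic mean of the data stream, duplicates included."""
    return sum(stream) / len(stream)


def range_midpoint(stream):
    """Midpoint of the range interval W = [smallest, largest]."""
    return (min(stream) + max(stream)) / 2


def closest_to_range_midpoint(stream):
    """Distinct data point nearest to the range midpoint (ties go to the smaller point)."""
    mid = range_midpoint(stream)
    return min(sorted(set(stream)), key=lambda d: abs(d - mid))


def data_set_midpoint(stream):
    """Middle element of the sorted data set (lower middle if the size is even)."""
    data = sorted(set(stream))
    return data[(len(data) - 1) // 2]


D = [2, 1, 4, 1.5, 2, 4.5]
for choose in (sample_mean, range_midpoint, closest_to_range_midpoint, data_set_midpoint):
    print(choose.__name__, choose(D))
```

All four results fall inside [2, 4], the inner interval of Example 2.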

6 Interval Cores 

A potential disadvantage of the nretbod, ^ 

(„}, »hifc *b« other “'^““AdtodvanW of tko« naethod. » lb«t they m»y elamM *»“ 

££££££' „ toKml a * ~ -a -*•- 

- .££--* - - " iocu " 

the intervals A' as follows: f [d (j)l C r ] , d (i) € Wi 

_ \ [C|,d(i)J , d (i) e Wr 

.. _ t *■£ _ *■, (j ?- is consistent with core C(F E ) — Pl-^ — 

Again /i and f r are nests, 80 ^ ‘ ^ . s_ _ _ /d f .a is defined, where the operation - of a set from 

special treatment. 

6.1 Choice of C 

dataset Ef- C = S n '/J- 

_ A , e c all instances of them will be eliminated from D in forming D~ ■ 

Since d (n </ 2 ).<W +1 \ ’ 

Modified c— — » — - “« « * **“ ^ “ '“ b " “ ' 

Thus a core would be selected C = S^i U *«ft, 


that eliminates instances of the three data points d>^ , d'^ , and from D. 

Alternatively, the midpoints of the two disjoint intervals rround d> m can be selected as the endpoints 

I* ri . H ^ 


of C: 


C = 




+ <f. 


L 

Interval Around Focus: Given a method from Section 5.2 to select a point focus $u$, then $C$ can 
just be selected as the data-generated disjoint interval around $u$: 

$C = \delta_i$, $u \in \delta_i$. 

As above, instances of $d'_{(i)}$ and $d'_{(i+1)}$ will be eliminated from $D$. 

i a a Vo„io- it mav be aoDroDriatc for the user to involve some traditional star 

deviation of «: C = [u - <r(D), u + *0)} . 

Information Principles: Methods of Uncertainty Maximization or Invariance can be applied, as discussed 
in Section 5.2. 

7 Consonant Intervals from Focused Point Data 

. , . . . -o M far as generating consonant, not just consistent, families from a data stream p. 

It may he ^^!lprogreJfrom consistent families with point focuses, through consistent families 
However, as the methods progress iro . c e ; ncrea ses, thus loosing information available m 

.ub -»—«*• r in ^ 

consistent cases. 

Again, a number of methods present themselves. 

T m Intervals from Interval Core: Assume that an interval core C = [C», C r l has been de- 
Inner Nested Intervals trom . , . e_ . • , rj eno » e A} — C, and construct a set of 

- *i- * - **■ - 

the nearest interval determined by & containing A 


4 +1 = «&,«*> < A >’ A * +1 = > 


The $A^k$ are available up to a maximal interval equal to $W$. $\mathcal{F}^E = \{A^k\}$ is then a consonant class. The count 
of $A^k$ can be determined as the maximum number of occurrences of either endpoint of $A^k$ in $D$. 

_ i r n„:„* Assume instead that a point core has been deter- 

ZmU » ft*. 5.2. ft. AU.Pl, !«. 1* = M - ** *■ 

method above. 

Outer Nested Intervals: Proceed in the opposite direction from above. Now define $A^1 = W$, and 
construct $A^{k+1}$ from $A^k$ as follows: 

a}*‘ = ^ < 


Acknowledgements 


Prof. George Klir and Dr. Peter Cariani provided useful discussion on these subjects. I would also like to 
thank an anonymous reviewer for his or her helpful comments. 


References 

[1] David, Herbert A: (1981) Order Statistics, Wiley, New York 

[2] Dubois, Didier and Prade, Henri: (1987) "Fuzzy Numbers: An Overview", in: Analysis of Fuzzy Information, v. 1, 
pp. 3-39, CRC Press, Boca Raton 

[3] Dubois, Didier and Prade, Henri: (1988) Possibility Theory, Plenum Press, New York 

[4] Dubois, Didier and Prade, Henri: (1989) "Fuzzy Sets, Probability, and Measurement", European J. of 
Operational Research, v. 40:2, pp. 135-154 

[5] Dubois, Didier and Prade, Henri: (1990) "Consonant Approximations of Belief Functions", Int. J. 
Approximate Reasoning, v. 4, pp. 419-449 

[6] Dubois, Didier and Prade, Henri: (1992) "Evidence, Knowledge, and Belief Functions", Int. J. Approximate 
Reasoning, v. 6:3, pp. 295-320 

[7] Geer, James and Klir, George: (1991) “Discord in Possibility Theory” , InL 1 Gen. Sgs., v. , PP- - 
132 

[8] Goodman, IR and Nguyen, HT: (1986) Uncertainty Models for Knowledge-Based Systems, North-Holland, 
Amsterdam 

[9] Joslyn, Cliff: (1991) "Towards an Empirical Semantics of Possibility Through Maximum Uncertainty", 
in: Proc. IFSA 1991, v. A, pp. 86-89 

[10] Joslyn, Cliff: (1992) "Empirical Possibility and Minimal Information Distortion", in: Fuzzy Logic: State 
of the Art, ed. R. Lowen, Kluwer, in press 

[11] Joslyn, Cliff and Klir, George: (1992) "Minimal Information Loss Possibilistic Approximations of Random 
Sets", in: Proc. 1992 FUZZ-IEEE Conference, ed. Jim Bezdek, pp. 1081-1088, IEEE, San Diego 

[121 Kli,, Guo*- (1990) -Unuurtsmly Principle. in Hunuunnr.in, ft~ '*» W' “ MM 

[13] Klir, George: (1990) "A Principle of Uncertainty and Information Invariance", Int. J. Gen. Sys., v. 17:2-3, 
pp. 249-275 

[14] Klir, George and Folger, Tina: (1987) Fuzzy Sets, Uncertainty, and Information, Prentice Hall 

[15] Klir, George and Parviz, Behzad: (1992) "A Note on the Measure of Discord", in: Proc. 8th Conf. on 
Uncertainty in AI, Stanford 

• „ . a mqqo\ “Pnswihilitv- Probability Conversions: An Empirical Study , 

1161 - -««■ — *« n ” ** 

[17] Wang, PZ and Liu, XH: (1984) “Set^Valued Statistics”, J. Eng. Math., v. 1:1, pp. 43-54 




THE FUSION OF INFORMATION VIA FUZZY INTEGRATION 




James M. Keller and Hossein Tahani 
Department of Electrical and Computer Engineering 
University of Missouri-Columbia 
Columbia, MO 65211 



ABSTRACT 

Multisensor fusion is becoming increasingly important in 
intelligent computer vision systems. In this paper we present the 
generalized fuzzy integral with respect to an S-decomposable measure 
as a tool for fusing information from multiple sensors in an object 
recognition problem. Results from an experiment with automatic target 
recognition imagery are provided. 


INTRODUCTION 

Many intelligent systems use multiple information sources because 
the information from any individual source is either partial or 
contaminated, that is, it is uncertain and/or imprecise. To evaluate 
this information properly, intelligent systems must be capable of 
integrating both complementary and redundant information provided by 
multiple knowledge sources. Pattern classifiers, scene analysis 
systems, image processing systems, and computer vision systems all 
must be capable of integrating knowledge from multiple sources. 

In an earlier work, we developed a new evidence fusion technique, 
based on the fuzzy integral with respect to $g_\lambda$-fuzzy measures [1]. The 
fuzzy integral differs from the previously mentioned paradigms in that 
both objective evidence supplied by various sources and the expected 
worth of the subsets of these sources are considered in the fusion 
process. In [2] we developed the fuzzy integral with respect to 
different classes of fuzzy measures, namely, S-decomposable measures, 
as an information fusion technique. We generalized the concept of the 
fuzzy integral to increase the flexibility in the "rule of 
combination" of evidence. In this paper, we briefly survey that 
development and demonstrate the usefulness of the generalized fuzzy 
integral in a multisensor fusion domain. 


FUZZY MEASURES 

Let X be a finite set and let $\Omega$ be the power set of X. The 
elements of $\Omega$ are called measurable subsets of X. 

Definition 1: A set function $\mu: \Omega \to [0,1]$ is called a fuzzy measure 
iff the following axioms hold. 

(1) $\mu(\emptyset) = 0$, $\mu(X) = 1$, 

(2) $\mu(A) \le \mu(B)$ if $A \subseteq B$. 


Fuzzy measures based on triangular conorms (t-conorms) have been 
studied by Dubois and Prade in [3]. They have shown many interesting 
properties of these types of fuzzy measures and their relation to 
Shafer's belief and plausibility measures. Weber in [4] studied the 
fuzzy measures based on Archimedean t-conorms to define the Weber 
integrals. He called the fuzzy measures based on t-conorms 
S-decomposable measures. 

We define the S-decomposable measures, following Weber. Let X be 
a finite set. Note that this restriction is only for simplicity and 
that all of our applications assume finite sets, but the theory can be 
extended to infinite sets (see [4]). 


Definition 2: A function $\mu: \Omega \to [0,1]$ with $\mu(\emptyset) = 0$ and $\mu(X) = 1$ 
is called an S-decomposable measure with respect to a t-conorm S iff 
for $A, B \subseteq X$ with $A \cap B = \emptyset$, 

$\mu(A \cup B) = S(\mu(A), \mu(B))$. 

Definition 3: A mapping $X \to [0,1]$ defined by $x_i \mapsto \mu(\{x_i\}) = \mu^i$ is 
called a fuzzy density mapping and the set $\{\mu^1, \ldots, \mu^n\}$ is called the 
fuzzy density set. 

We note that an S-decomposable measure is uniquely defined by 
knowing the t-conorm and the fuzzy density mapping. Let $X = \{x_1, \ldots, x_n\}$ 
be a finite set and let $\mu^i = \mu(\{x_i\})$. If A is a subset of X, 
$A = \{y_1, \ldots, y_p\}$, then 

$\mu(A) = \mu(\{y_1\} \cup \cdots \cup \{y_p\}) = S_{j=1}^{p} \mu(\{y_j\})$. 

Now, since $\mu(X) = 1$, the fuzzy densities must satisfy 

$S_{i=1}^{n} \mu^i = 1$. 

This equality would be trivially true if $\mu^i = 1$ for some i. Thus an 
S-decomposable measure can be constructed by knowing the density 
mapping and assuming that at least one of the fuzzy densities is 1. 
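As an illustration of this construction, the measure of any subset can be obtained by folding a t-conorm over the densities of its elements. The sketch below assumes three hypothetical sources with made-up densities (one of them equal to 1, as required above) and uses three of the t-conorms listed in the appendix. None of the names come from the paper.

```python
from functools import reduce


def bounded_sum(a, b):
    return min(1.0, a + b)


def algebraic_sum(a, b):
    return a + b - a * b


def logical_sum(a, b):
    return max(a, b)


def measure(subset, density, conorm):
    """mu(A): combine the densities of the elements of A with the t-conorm.

    The fold starts from 0, the neutral element of every t-conorm,
    so the empty set gets measure 0.
    """
    return reduce(conorm, (density[x] for x in subset), 0.0)


# Hypothetical fuzzy densities; at least one equals 1 so that mu(X) = 1.
density = {"x1": 0.3, "x2": 1.0, "x3": 0.5}

for conorm in (logical_sum, algebraic_sum, bounded_sum):
    print(conorm.__name__,
          measure(["x1", "x3"], density, conorm),
          measure(["x1", "x2", "x3"], density, conorm))
```

The same density set gives a different measure for each t-conorm, while the measure of the whole set stays at 1.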


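THE FUZZY INTEGRAL 

Definition 4: Let X be a finite set. The fuzzy integral over $A \subseteq X$ 
of a function $h: X \to [0,1]$ with respect to a fuzzy measure $\mu$ 
is defined by 

$\int_A h(x) \circ \mu(\cdot) = \sup_{E \subseteq X} \left[ \min\left( \min_{x \in E} h(x), \mu(A \cap E) \right) \right]$. 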


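The calculation of the fuzzy integral over X is easily given. Let 
$X = \{x_1, \ldots, x_n\}$ and let $h: X \to [0,1]$ be a function. Suppose 
$h(x_1) \ge h(x_2) \ge \cdots \ge h(x_n)$ (if not, X is rearranged so that this 
relation holds). Then Sugeno in [5] proved that the fuzzy integral, e, 
of h with respect to a fuzzy measure $\mu$ over X can be computed by 

$e = \max_{i=1}^{n} \left[ \min\left( h(x_i), \mu(A_i) \right) \right]$, (1) 

where $A_i = \{x_1, \ldots, x_i\}$. 

When $\mu$ is an S-decomposable measure, the values 
of $\mu(A_i)$ can be determined recursively as 

$\mu(A_1) = \mu(\{x_1\}) = \mu^1$, (2a) 

$\mu(A_i) = S(\mu^i, \mu(A_{i-1}))$, $2 \le i \le n$. (2b) 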
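Equations (1) and (2) translate directly into a single pass over the sources sorted by decreasing h. The following is a minimal sketch with hypothetical source names and values. The t-conorm that defines the measure is passed as a parameter, and the recursion starts from 0, which is neutral for every t-conorm.

```python
def bounded_sum(a, b):
    return min(1.0, a + b)


def sugeno_integral(h, density, conorm=max):
    """Fuzzy integral of h over X by equation (1), with mu(A_i) from (2a)-(2b).

    h and density are dicts keyed by source name; conorm is the t-conorm
    that defines the S-decomposable measure.
    """
    order = sorted(h, key=h.get, reverse=True)   # h(x_1) >= ... >= h(x_n)
    best, mu = 0.0, 0.0
    for x in order:
        mu = conorm(density[x], mu)              # mu(A_i) = S(mu^i, mu(A_{i-1}))
        best = max(best, min(h[x], mu))          # running maximum of equation (1)
    return best


# Hypothetical partial evaluations and densities for three sources.
h = {"x1": 0.9, "x2": 0.6, "x3": 0.3}
density = {"x1": 0.2, "x2": 0.5, "x3": 1.0}

print(sugeno_integral(h, density))               # measure built with max
print(sugeno_integral(h, density, bounded_sum))  # measure built with the bounded sum
```

A larger t-conorm produces larger measures for the nested sets, so the second value cannot be smaller than the first.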


The reader is directed to [1,2,5] for many theoretical properties 
of fuzzy measures and fuzzy integrals. 

Let $\{\mu^i : 1 \le i \le n\}$ be a fuzzy density set and let $\mu^0$, $\mu^B$, $\mu^\Pi$, 
and $\mu^M$ denote the fuzzy measures based on the $S_0$, $S_B$, $S_\Pi$, and $S_M$ t-conorms 
(see appendix), each constructed from the fuzzy density set. Since 
$S_0 \ge S_B \ge S_\Pi \ge S_M$, for any subset A of X, 

$\mu^0(A) \ge \mu^B(A) \ge \mu^\Pi(A) \ge \mu^M(A)$. (3) 

For these measures, the following theorem (proved in [2]) is of 
interest. 

Theorem 1: Let $\{\mu^i : 1 \le i \le n\}$ be a fuzzy density set that defines a 
fuzzy measure. Let h be a function from X to [0,1]. Then 

$\int_X h(x) \circ \mu^0(\cdot) \ge \int_X h(x) \circ \mu^B(\cdot) \ge \int_X h(x) \circ \mu^\Pi(\cdot) \ge \int_X h(x) \circ \mu^M(\cdot)$. 


THE GENERALIZED FUZZY INTEGRALS 

In the definition of the fuzzy integral Sugeno, in a loose sense, 
used the max and the min operators to replace the addition and the 
multiplication in the Lebesgue integral. It seems natural to 
generalize the fuzzy integral by using a t-norm instead of the min 
operator and by replacing the max operator with a t-conorm [6]. 

In [2] we suggested two types of generalizations of the fuzzy 
integral which have natural interpretations. The fuzzy integral, as 
defined in equation (1), may be interpreted as "the highest 
pessimistic" grade of agreement between the objective evidence, h, and 
the expectation, $\mu$. For the first generalization, we replace the min 
operator by any t-norm, ranging from $T_M$ to $T_0$ (see appendix). The 
resultant integrals can be interpreted as ranging from "the highest 
pessimistic" to "the lowest pessimistic" grade of agreement between h 
and $\mu$. 

Let X be a finite set. Let h be a function from X into the closed 
interval [0,1] and assume that h is sorted in decreasing order. Then 
the above generalization of the fuzzy integral of the function h with 
respect to a fuzzy measure $\mu$ is written as 

$e_T = \max_{i=1}^{n} \left[ T\left( h(x_i), \mu(A_i) \right) \right]$, (4) 

where $A_i = \{x_1, \ldots, x_i\}$, and T is a t-norm. For example, $e_\Pi$ is the 
integral value of the function h with respect to a fuzzy measure $\mu$ 
when the t-norm $T_\Pi$ is used instead of the min operator, $T_M$. 

In [7] an alternative definition of the fuzzy integral of the 
function h with respect to a fuzzy measure $\mu$ is given by 

$\int_A h(x) \circ \mu(\cdot) = \inf_{E \subseteq X} \left[ \max\left( \min_{x \in E} h(x), \mu(A \cap E) \right) \right]$. 

When X is a finite set and $h(x_1) \ge \cdots \ge h(x_n)$, an optimistic version 
of this integral can be calculated by 

$E = \min_{i=1}^{n} \left[ \max\left( h(x_i), \mu(A_i) \right) \right]$, (5) 

where $A_i = \{x_1, \ldots, x_i\}$. This integral can be interpreted as "the 
lowest optimistic" grade of agreement between h and $\mu$. Then by 
replacing the max operator by any t-conorm ranging from $S_M$ to $S_0$, the 
resultant integrals will be interpreted as ranging from "the lowest 
optimistic" to "the highest optimistic" grade of agreement between h 
and $\mu$. Similar to (4), letting $E_S$ denote the generalized fuzzy 
integral of the function h (assuming h is sorted in decreasing order) 
with respect to a fuzzy measure $\mu$ using the t-conorm S instead of the max 
operator in (5), we can write 

$E_S = \min_{i=1}^{n} \left[ S\left( h(x_i), \mu(A_i) \right) \right]$, (6) 

where $A_i = \{x_1, \ldots, x_i\}$. 


The following theorems (proved in [2]) establish an ordering of 
the generalized fuzzy integrals for a fixed function h and fuzzy 
measure $\mu$. 

Theorem 2: Let $\mu$ be a fuzzy measure and $h: X \to [0,1]$. Let 
$S_1 \le S_2$ be two t-conorms. Then $E_{S_1} \le E_{S_2}$. 

Corollary 1: Let $\mu$ be a fuzzy measure and $h: X \to [0,1]$. Then 
$E_0 \ge E_B \ge E_\Pi \ge E_M$. 

Similar to theorem 2, we have: 

Theorem 3: Let $\mu$ be a fuzzy measure and $h: X \to [0,1]$. Let 
$T_1 \le T_2$ be two t-norms. Then $e_{T_1} \le e_{T_2}$. 

Corollary 2: Let $\mu$ be a fuzzy measure and $h: X \to [0,1]$. Then 
$e_M \ge e_\Pi \ge e_B \ge e_0$. 
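The two orderings can be checked numerically on a small case. The sketch below uses the standard definitions of the four t-norms and four t-conorms named in the appendix and evaluates the generalized integrals in the forms of equations (4) and (6). The h values and densities are hypothetical, and the nested sets are assumed to be built in order of decreasing h for both families of integrals.

```python
def drastic_sum(a, b):
    return a if b == 0 else b if a == 0 else 1.0


def bounded_sum(a, b):
    return min(1.0, a + b)


def algebraic_sum(a, b):
    return a + b - a * b


def drastic_product(a, b):
    return a if b == 1 else b if a == 1 else 0.0


def bounded_product(a, b):
    return max(0.0, a + b - 1.0)


def algebraic_product(a, b):
    return a * b


# Both lists run from the smallest operator to the largest.
T_NORMS = [drastic_product, bounded_product, algebraic_product, min]
T_CONORMS = [max, algebraic_sum, bounded_sum, drastic_sum]


def nested_measures(h, density, measure_conorm=max):
    """Sort h in decreasing order and return (sorted h, [mu(A_1), ..., mu(A_n)])."""
    order = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
    mus, mu = [], 0.0
    for i in order:
        mu = measure_conorm(density[i], mu)   # recursion (2a)-(2b)
        mus.append(mu)
    return [h[i] for i in order], mus


def e_T(h, density, t_norm, measure_conorm=max):
    """Pessimistic generalized integral, in the form of equation (4)."""
    hs, mus = nested_measures(h, density, measure_conorm)
    return max(t_norm(a, m) for a, m in zip(hs, mus))


def E_S(h, density, t_conorm, measure_conorm=max):
    """Optimistic generalized integral, in the form of equation (6)."""
    hs, mus = nested_measures(h, density, measure_conorm)
    return min(t_conorm(a, m) for a, m in zip(hs, mus))


h = [0.75, 0.5, 0.25]
density = [0.25, 0.5, 1.0]
pessimistic = [e_T(h, density, t) for t in T_NORMS]
optimistic = [E_S(h, density, s) for s in T_CONORMS]
print(pessimistic)
print(optimistic)
```

Both lists come out in non-decreasing order, as Theorems 2 and 3 require. The values were chosen to be exactly representable in binary floating point so that the comparisons are not disturbed by rounding.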


APPLICATIONS - MULTI SENSOR FUSION 

The fuzzy integral was used as a segmentation tool in [8-9], and 
as a fusion technique in [1-2]. Here, the design and the 
implementation of a multisensor object recognition system using the 
generalized fuzzy integrals with respect to t-conorm-based fuzzy 
measures (S-decomposable measures) is explained. 

At any level of recognition, the classification problem can be 
stated as follows. Let $C = \{C_1, \ldots, C_m\}$ be a set of classes or 
hypotheses of interest. Let A be an object under consideration in the 
scene. Then one must decide to which class $C_k$ object A belongs. Note 
that each $C_k$ may, in fact, be a set of classes by itself. 

Let $X = \{x_1, \ldots, x_n\}$ be a finite set. Each $x_i$ is a knowledge 
source or may itself be a set of knowledge sources for the recognition 
of a particular class $C_k$, $1 \le k \le m$. Let A be the object under 
consideration for recognition. Let $h_k: X \to [0,1]$ be the partial 
evaluation of the object A for class $C_k$, that is, $h_k(x_i)$ is an 
indication of how certain we are in classifying the object A in class 
$C_k$ using the knowledge source $x_i$. 

In order to calculate the fuzzy integral value, the degree of 
importance, $\mu^i$, of how significant $x_i$ is in the recognition of the 
class $C_k$ must be given. These densities can be subjectively assigned 
by an expert, or can be generated from a training data set, as in 
[1,2,9]. 

After sorting the h function in descending order (along with 
their corresponding densities), we can construct the S-decomposable 
measure, $\mu$, using equations (2). Now, using equations (4) or (6), the 
generalized fuzzy integral value can be calculated. 





DO FOR each object 
    DO FOR each class 
        Get h_k(x_i) 
        Sort h_k(x_i) in descending order 
        Calculate measures recursively by equation 2 
        Calculate generalized fuzzy integrals by equation 4 or 6 
    END DO 
    Classify object into class with largest integral value 
END DO 
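The loop above could be realized along the following lines. This is a sketch only: the class names, evaluations and densities are invented for illustration, and the integral is written generically so that the choice between equation (4) and equation (6) is made through the `combine` and `aggregate` arguments, which are assumptions of this example rather than names used in the paper.

```python
def generalized_integral(h, density, measure_conorm, combine, aggregate):
    """Generalized fuzzy integral of h.

    combine=min, aggregate=max gives equation (1); a t-norm with max gives
    equation (4); a t-conorm with min gives equation (6).
    """
    order = sorted(range(len(h)), key=lambda i: h[i], reverse=True)
    mu, values = 0.0, []
    for i in order:
        mu = measure_conorm(density[i], mu)   # equation (2)
        values.append(combine(h[i], mu))
    return aggregate(values)


def classify(evaluations, densities, measure_conorm=max, combine=min, aggregate=max):
    """Return the class with the largest integral value, plus all scores."""
    scores = {
        cls: generalized_integral(h, densities[cls], measure_conorm, combine, aggregate)
        for cls, h in evaluations.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical partial evaluations h_k(x_i) of one object by three sources.
evaluations = {"target": [0.75, 0.5, 0.25], "non-target": [0.25, 0.25, 0.25]}
densities = {"target": [0.25, 0.5, 1.0], "non-target": [0.25, 0.5, 1.0]}

label, scores = classify(evaluations, densities)
print(label, scores)
```

Each class keeps its own density list, since the importance of a source may differ from class to class.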


RESULTS 

The data consist of several sequences of FLIR (forward-looking 
infrared) and TV images containing an armored personnel carrier 
(APC) and two different tanks. There were five 100-frame sequences of 
FLIR and two 100-frame sequences of TV images. Sequence 5 of FLIR and 
sequence 1 of TV were taken simultaneously and constitute the 
multisensor data. 

Size-contrast filters were run on each image to detect objects of 
interest. Several different statistical and texture features were 
calculated for the object windows found by the prescreener. 
Here the features are assumed to support the existence of an object 
directly. In this experiment the system was tested by multisensor 
data on sequence 5 of FLIR and sequence 1 of TV, using sequence 4 of 
FLIR and sequence 2 of TV for training. The h functions were generated 
using the (smoothed) normalized histogram of the training data, and 
the fuzzy densities were generated using the method described in [10]. 
Here, we consider the problem of target vs. non-target. In this 
problem there were 11 features for each sensor. These features 
consisted of four statistical features and seven texture features 
calculated on the unsegmented objects. In [11], we subdivided this 
problem into specific classification problems, and investigated the 
effect of a multilayer structure on the multisensor fusion problem for 
object recognition. 

Tables 1 and 2 show these results. Table 1 shows the result of 
using different integration values for final classification and Table 
2 shows the confusion matrix of the best overall classification for 
the problem. As can be seen from Table 1, the best total correct 
classification occurs for $E_0$, the highest optimistic integral value. 
This is due to the fact that many sources (features) provided zero 
values for all classes including the correct class. The reason for 
this is that the training data used to generate the h values was 
considerably different from the testing data. The testing data was 
registered multisensor data whereas the training data consisted of 
two non-corresponding sets of single-sensor data, a practical problem 
when dealing with real data. The robustness of this approach is 
demonstrated by the fact that, except for the most pessimistic 
integral, over 68% of the data was correctly classified in spite of 
the problem with the data. 


Table 1 

% Total Correct Classifications for 1-level Configuration 

 E_0     E_B     E_Π     E_M     e_M     e_Π     e_B     e_0 
 96.9    78.2    80.1    79.1    68.7    71.8    75.5    9.8 


Table 2 

Confusion Matrix of Best Overall Classification 

               Target    Non-target 
Target           300          0 
Non-target        10         16 


CONCLUSIONS 

In this paper, a generalization of an earlier methodology for 
information fusion using the generalized fuzzy integral with respect 
to a fuzzy measure based on a t-conorm was applied to the problem of 
multisensor fusion. S-decomposable measures allow the prediction of 
the effects of changes in importance of nodes to the overall 
evaluation. Also, these measures can simulate the different attitudes 
necessary for information fusion. 

The generalized fuzzy integral algorithm as a multisensor fusion 
paradigm was applied to the problem of automatic target recognition 
and produced excellent results. 



REFERENCES 

[1] H. Tahani and J. Keller, "Information Fusion in Computer Vision 
Using the Fuzzy Integral," IEEE Transactions on Systems, Man, and 
Cybernetics, Vol. 20, No. 3, pp. 733-741, May 1990. 

[2] H. Tahani and J. Keller, "The Generalized Fuzzy Integral and Its 
Application to Object Recognition," under review, IEEE Transactions on 
Systems, Man, and Cybernetics. 

[3] D. Dubois and H. Prade, "A Class of Fuzzy Measures Based on 
Triangular Norms," Inter. J. Gen. Systems, 8, pp. 43-61, 1982. 

[4] S. Weber, "⊥-Decomposable Measures and Integrals for Archimedean 
t-Conorms ⊥," J. Math. Analysis and Applications, 101, pp. 114-138, 
1984. 

[5] M. Sugeno, "Theory of Fuzzy Integrals and Its Applications," 
Doctoral dissertation, Tokyo Institute of Technology, 1974. 

[6] H. Tahani, "Uncertainty Modeling Using Triangular Norms and 
Triangular Conorms," Proceedings, North American Fuzzy Information 
Processing Society-91, Columbia, MO, pp. 18-21, May 1991. 

[7] S. Wierzchon, "On Fuzzy Measure and Fuzzy Integral," Fuzzy 
Information and Decision Processes, M. Gupta and E. Sanchez (eds.), 
North-Holland Publishing Company, pp. 79-86, 1982. 

[8] J. Keller, H. Qiu, and H. Tahani, "Fuzzy Integral and Image 
Segmentation," Proc. North American Fuzzy Information Processing 
Society, New Orleans, pp. 334-338, June 1986. 

[9] H. Qiu and J. Keller, "Multispectral Image Segmentation Using 
Fuzzy Techniques," Proc. North American Fuzzy Information Processing 
Society, Purdue University, pp. 374-387, May 1987. 

[10] H. Tahani and J. Keller, "Automated Calculation of Non-additive 
Measures for Object Recognition," SPIE's Conference on Intelligent 
Robots and Computer Vision IX: Algorithms and Techniques, Boston, MA, 
November 1990. 

[11] H. Tahani, "The Generalized Fuzzy Integral in Computer Vision," 
Doctoral dissertation, University of Missouri-Columbia, May 1991. 







Appendix 

The t-conorms used in this paper: 

1. Drastic sum: $S_0(a,b) = a$ if $b = 0$; $b$ if $a = 0$; $1$ if $a, b > 0$. 

2. Bounded sum: $S_B(a,b) = \min(1, a+b)$ 

3. Algebraic sum: $S_\Pi(a,b) = a + b - ab$ 

4. Logical sum: $S_M(a,b) = \max(a,b)$ 

The t-norms used in this paper: 

1. Drastic product: $T_0(a,b) = a$ if $b = 1$; $b$ if $a = 1$; $0$ if $a, b < 1$. 

2. Bounded product: $T_B(a,b) = \max(0, a+b-1)$ 

3. Algebraic product: $T_\Pi(a,b) = ab$ 

4. Logical product: $T_M(a,b) = \min(a,b)$ 



N93-29569 

ON THE EVALUATION OF FUZZY QUANTIFIED QUERIES 
IN A DATABASE MANAGEMENT SYSTEM 

Patrick BOSC & Olivier PIVERT 
IRISA/ENSSAT 
BP 447 

22305 Lannion Cédex 
FRANCE 


ABSTRACT 

Many proposals to extend database management systems have been made in the last decade. 
Some of them aim at the support of a wider range of queries involving fuzzy predicates. 
Unfortunately, these queries are somewhat complex and the question of their efficiency is a 
subject under discussion. In this paper, we focus on a particular subset of queries, namely those 
using fuzzy quantified predicates. More precisely, we will consider the case where such predicates 
apply to individual elements as well as to sets of elements. Thanks to some interesting 
properties of α-cuts of fuzzy sets, we are able to show that the evaluation of these queries can be 
significantly improved with respect to a naive strategy based on exhaustive scans of sets or files. 

I. INTRODUCTION 

The database management systems currently available are based on the relational model and they suffer from 
several limitations regarding user or application needs. In particular, it is assumed that data are precisely known 
(or fully unknown) and queries are based on crisp conditions. The notion of imprecision can be introduced in 
such systems at two levels : for representing imprecise or uncertain data and to allow flexible queries. In this 
paper, we will only consider the second aspect, that is to say that the data are assumed to take their values in 
ordinary universes, whereas queries may contain imprecise conditions. In this way, regular data bases are taken 
into account and the users are provided with answers consisting of a list of elements (tuples) ordered according 
to how well they satisfy the query. 

Various kinds of compound fuzzy predicates have been proposed in recent years [4, 8]. Base predicates 
described as fuzzy sets (i.e. by means of characteristic functions) can be altered by linguistic modifiers and 
arranged together using connectors or aggregates in order to reach the appropriate semantics. Depending on the 
context, a predicate may apply to individual tuples or to sets of tuples; in both cases, a problem of performance 
is posed if the number of tuples is large and an exhaustive scan is performed. In a previous paper [3], we 
concentrated on the evaluation of compound fuzzy predicates applying to individual tuples. In particular, we 
showed that the computation of an alpha-cut of a fuzzy set could be performed in two steps : efficient selection 
of a superset of the alpha-cut by means of a boolean condition followed by the computation of the alpha-cut 
itself from this superset. In this paper, we deal with the evaluation of fuzzy quantified predicates which can 
concern either individual tuples or sets of tuples. Fuzzy quantifiers were first introduced by L.A. Zadeh [10] to 
generalize the existential (∃) and universal (∀) quantifiers. Recently, R. Yager suggested another approach to 
the definition of fuzzy quantifiers [7, 8, 9]. Our aim is to point out some efficient strategies for the evaluation 
of fuzzy quantified predicates, since efficiency is a key point in DBMS's [5]. 

In section 2, fuzzy quantified predicates are introduced along with their two possible interpretations. 
Their use in the framework of an extended relational language is also illustrated. In section 3, we point out 
some interesting properties of the OWA aggregation operator which will be useful in improving the 
evaluation. The evaluation of fuzzy quantified predicates applied to individual tuples and sets of tuples is 
discussed in sections 4 and 5 respectively. Starting from a naive strategy based on an exhaustive scan of the 
concerned elements, we point out some properties intended for limiting the data to be accessed (and 
consequently the I/O volume). To conclude, we summarize the main results and draw some directions for future 
work. 

II. FUZZY QUANTIFIED PREDICATES AND THEIR INTERPRETATION 
2.1. Tuple and set oriented predicates 

In the usual relational framework we can distinguish predicates P applying to individual elements (x's) 
of a set X : 

P : x ∈ X → [0,1] 

and predicates whose argument is a whole set X of elements : 

P : X → [0,1]. 

Typical examples of these two categories in an SQL language are : 

"find the employees earning more than $4000" expressed : 

select * from EMPLOYEE where salary > 4000 

and "find the departments where the average of the salaries is over $4000" expressed : 

select dep from R group by dep having avg(salary) > 4000. 

2.2. Fuzzy quantifiers according to Zadeh 

Fuzzy quantifiers were originally applied to sets of tuples [10] as shown in the following query : 

"find the best 10 departments where at least three employees are middle-aged" expressed : 

select 10 dep from EMPLOYEE 
group by dep having at least three are middle-aged        (A). 

Afterwards, J. Kacprzyk suggested an adaptation for individual tuples [6] inside queries such as : 

"find the best 10 employees matching almost all of the predicates [middle-aged, really well-paid, ...]" 
expressed : 

select 10 * from EMPLOYEE 
where almost-all among [middle-aged, really well-paid, ...]        (B). 

In both cases, the quantifier is seen as a fuzzy set defined on the cardinality of a fuzzy set. In example 
A, the quantifier is absolute (at least three) and the associated fuzzy set maps R into the unit interval. If 
AQ stands for an absolute quantifier, a degree is associated with the expression "AQ X are D". In example B, we 
have a relative quantifier, which is represented by a mapping from [0,1] into [0,1]. If RQ 
represents a relative quantifier, the expression "x matches RQ among {P_1, ..., P_n}" is defined as 
$\mu_{RQ}\left( \left( \sum_{i=1}^{n} \mu_{P_i}(x) \right) / n \right)$. Possible shapes for the quantifiers used in these examples are given in figure 1. 


[Figure: membership function of the absolute quantifier "at least three", plotted over the cardinalities 0, 1, 2, ..., 8, ...]

Figure 1. Examples of the representation of two quantifiers.


2.3. Fuzzy quantifiers according to Yager

R. Yager recently suggested representing monotonous quantifiers by means of OWA aggregations [9].

First of all, let us recall the definition of an OWA :


$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) = \sum_{i=1}^{n} w_i \cdot x_{k_i} \qquad (1)$$

where $x_{k_i}$ is the i-th largest value among the $x_k$'s.

Example. Let us consider the case where W = (.1, .2, .3, .4) and X = (.4, .9, .6, .1). We will compute
(.1 * .9) + (.2 * .6) + (.3 * .4) + (.4 * .1) and we get the value : .37.
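The OWA definition can be checked numerically with a short sketch. The function name `owa` is an assumption introduced here for illustration, and the third value of X is read as .6, which is the reading that reproduces the value .37 given in the text.

```python
def owa(weights, values):
    """Ordered weighted average: the i-th weight multiplies the i-th largest value."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    ordered = sorted(values, reverse=True)  # x_k1 >= x_k2 >= ... >= x_kn
    return sum(w * v for w, v in zip(weights, ordered))


# Weights and values of the example (third value read as .6).
result = owa([0.1, 0.2, 0.3, 0.4], [0.4, 0.9, 0.6, 0.1])
print(round(result, 2))
```

Note that the weights are attached to ranks, not to particular arguments, so permuting the values leaves the result unchanged.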

A quantifier is represented by the weights put into the operator, each of which expresses ... (see figure 2).

Figure 2. Weights design for a fuzzy quantifier.


2.4. Comparison of the two approaches 

Let us consider a condition where at least three elements are C, applied to two sets ... and
{a'/.4, b'/.7, c'/.6, d'/.5, e'/.8}, where the degree indicates the extent to which an element satisfies C.

According to Zadeh's definition, the two sets will be considered equivalent whatever the characteristic
function of the quantifier. On the contrary, if we take an OWA interpretation with weights w1 = 1/3,
w2 = 1/3, w3 = 1/3, wi = 0 ∀ i > 3, the degree for the first set is .9 and the degree for the second set
is .7. We believe that this interpretation will better meet database users' ... the semantics of such an
operator.

2.5. Queries under consideration 

More precisely, we will concentrate on the evaluation of three types of queries : 

- tuple-oriented (or horizontal) fuzzy quantified predicates :

select ... from R where Q among {P1, ..., Pn}

- type 1 set-oriented (or vertical-1) fuzzy quantified predicates :

select ... from R group by att having Q are D

- type 2 set-oriented (or vertical-2) fuzzy quantified predicates :

select ... from R group by att having Q (C are D)

In the first case, the weights ... (depending on the quantifier ...) ... in the
following manner [9] :

- type 1 : ...

- type 2 : i) compute yi = μC(xi) for each element xi ... ;
  ii) compute the weights wj ... ;
  iii) compute Ei = max(μD(xi), 1 - μC(xi)); C are D is seen as the implication C => D;
  iv) compute Σi wi * Eki, where Eki represents the i-th largest value among the Ej's.

In both cases n is the cardinality of the set of tuples concerned with the quantification. For the sake of clarity
and without loss of generality, we will assume that the predicates involved in a quantification (P1, ..., Pn, C and
D) are atomic fuzzy predicates.

... is given by :

(1 * 0) + (1 * .0025) + (.9 * .0375) + (.8 * .21) + (.7 * .75) = .72925.

III. SOME PROPERTIES OF THE OWA OPERATOR 

The idea which will be developed in the rest of the paper is situated in the scope of the evaluation of queries
such that the results are required for a given threshold λ (satisfaction degree). The reason for that is the fact that
... a small subset of tuples ... The condition
is expressed by means of an OWA operator and we will also take advantage of properties of such an operator.

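The OWA is a mean operator and so it has interesting properties :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) \le \max(x_1, \ldots, x_n) \qquad (2)$$

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, \ldots, x_n) \le \mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, 1, \ldots, 1) \qquad (3)$$

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, \ldots, x_n) \le \mathrm{OWA}(w_1, \ldots, w_n, 1, \ldots, 1, x_i, 1, \ldots, 1) \qquad (4)$$

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, \ldots, x_n) \ge \mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, 0, \ldots, 0) \qquad (5)$$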


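Properties (4) and (5) arise from : i) the monotonicity of any mean operator, ii) the fact that xi belongs to [0,1].
From these properties, one can derive conditions bearing on the xi's which are necessary for the
satisfaction of the condition :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) \ge \lambda$$

where λ is the desired threshold.

From (2), one can assert :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) \ge \lambda \;\Rightarrow\; \max(x_1, \ldots, x_n) \ge \lambda \;\Leftrightarrow\; \exists i,\ x_i \ge \lambda \qquad (6).$$

From (4), one can derive :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) \ge \lambda \;\Rightarrow\; \mathrm{OWA}(w_1, \ldots, w_n, 1, \ldots, 1, x_i, 1, \ldots, 1) \ge \lambda$$

$$\Leftrightarrow\; \Big(\sum_{j=1}^{n-1} w_j \cdot 1\Big) + w_n \cdot x_i \ge \lambda$$

$$\Leftrightarrow\; (1 - w_n) + w_n \cdot x_i \ge \lambda$$

and finally, we get :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) \ge \lambda \;\Rightarrow\; \forall i,\ x_i \ge \frac{\lambda + w_n - 1}{w_n} \qquad (7).$$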


This last formula is valid only if wn is strictly positive, otherwise no implication can be found. Moreover, it is
only profitable if (λ + wn) > 1 (otherwise, we have a condition which is trivially satisfied).


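From (3) and (5), we have :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, 1, \ldots, 1) < \lambda \;\Rightarrow\; \mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_n) < \lambda \qquad (8)$$

and :

$$\mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, 0, \ldots, 0) \ge \lambda \;\Rightarrow\; \mathrm{OWA}(w_1, \ldots, w_n, x_1, \ldots, x_i, \ldots, x_n) \ge \lambda \qquad (9).$$

These last two conditions will be used for partial evaluations of an OWA aggregation as we will see in section 5.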

IV. EVALUATION OF TUPLE-ORIENTED FUZZY QUANTIFIERS 

4.1. Initial strategy 

Let us consider the evaluation of quantified conditions applying to individual tuples, of the form :

select ... from R where Q among {P1, ..., Pn}.

The principle is to compute the sum Σi (wi * Pki(r)), where Pki(r) is the i-th largest degree among the Pj(r)'s,
and to compare it with the threshold λ. A naive algorithm is :

for each r in R do
    for i from 1 to n do V[i] = μPi(r) enddo;
    order the vector V (in decreasing order) giving V';
    GV = 0;
    for i from 1 to n do GV = GV + V'[i] * W[i] enddo;
    if GV ≥ λ then write(r/GV) endif
enddo
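The naive strategy can be sketched in Python as follows. This is only an illustration under assumptions: the relation is modelled as a list of dictionaries, and the two membership functions `young` and `well_paid`, as well as the sample rows, are hypothetical and do not come from the paper.

```python
def naive_select(relation, predicates, weights, threshold):
    """Exhaustive scan: aggregate the predicate degrees of every tuple with an OWA."""
    selected = []
    for r in relation:
        # V': degrees of the n predicates for tuple r, in decreasing order
        degrees = sorted((p(r) for p in predicates), reverse=True)
        gv = sum(w * d for w, d in zip(weights, degrees))
        if gv >= threshold:
            selected.append((r, gv))
    return selected


def clamp(x):
    """Keep a degree inside the unit interval."""
    return max(0.0, min(1.0, x))


# Hypothetical fuzzy predicates and data, for illustration only.
young = lambda r: clamp((40 - r["age"]) / 20)
well_paid = lambda r: clamp((r["salary"] - 30000) / 20000)

employees = [
    {"name": "a", "age": 25, "salary": 50000},
    {"name": "b", "age": 45, "salary": 32000},
]
answers = naive_select(employees, [young, well_paid], [0.5, 0.5], 0.5)
print(answers)
```

Every tuple of the relation is read and every predicate is evaluated, which is exactly the cost that the conditions of the next subsection try to avoid.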

4.2. Improvements

The previous algorithm is costly (exhaustive scan of R), but it is sure that no possibly satisfying tuple is missed.

At this point, we can come back to properties (2) and (4) and profit from the derived formulae (6) and (7) to
proceed in two steps : creation of a subset R' of R by means of a usual boolean condition, then the
application of the previous algorithm on R'. One expected interest of the first step is the fact that a regular
DBMS is able to work efficiently as far as indexes or access paths are available. Formula (6) becomes :

P1(r) ≥ λ or ... or Pn(r) ≥ λ

and if we assume that each fuzzy predicate Pi is represented by a trapezium on an attribute Ai, we finally get a
condition :

(r.A1 ∈ [i1,s1]) or ... or (r.An ∈ [in,sn])          (8)

where ij and sj are the inferior and superior values associated to the λ-level cut of Pj. In a similar way, formula
(7) (if (λ + wn) > 1 and wn > 0) leads to the condition :

(r.A1 ∈ [i'1,s'1]) and ... and (r.An ∈ [i'n,s'n])          (9)

where i'j and s'j are the inferior and superior values associated to the ((λ + wn - 1)/wn)-level cut of Pj.

Since these two conditions are necessary, we can combine them together by means of a conjunction. However,
it must be noted that condition (8) is a disjunction and, if used alone, it will generally require the entire scan of
the relation R to be executed; in this case, data access is not improved at all. On the other hand, condition
(9) is conjunctive and if one of the attributes Ai is indexed, the whole condition can be processed with a limited
number of data accesses.

4.3. An example 

Consider the membership functions drawn in figure 3 in the scope of the query :

select * from EMPLOYEE where
most [middle-aged, high-salary, low-commission, medium-sales, around(nb-children, 2)]

which requires OWA(w1, ..., w5, μmiddle-aged(e), ..., μaround(nb-children(e), 2)) ≥ .82 for each employee e,
i.e., the threshold λ is .82.

In this case, expression (8) becomes :

e.age ∈ [39,51] or e.salary ≥ 46000 or e.commission ≤ 8500
or e.sales ∈ [1.3,2.7] or e.nb-children = 2.          (10)

Since (λ + w5) = 1.18 is greater than 1, formula (7) is applicable, (λ + w5 - 1)/w5 = .18/.36 = .5 and
expression (9) yields :

e.age ∈ [37,53] and e.salary ≥ 41000 and e.commission ≤ 9800
and e.sales ∈ [1.2,2.8] and e.nb-children ∈ [1,3].          (11)
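The level used in the conjunctive condition can be recomputed with a short sketch. It assumes that "most" is the function x → x², the shape given for this quantifier in section 5.3, which is consistent with the value w5 = .36 used here, and that the weights follow W[i] = Q(i/n) - Q((i-1)/n). The function names are introduced for illustration only.

```python
def quantifier_weights(q, n):
    """OWA weights of a relative quantifier q: W[i] = q(i/n) - q((i-1)/n)."""
    return [q(i / n) - q((i - 1) / n) for i in range(1, n + 1)]


def necessary_level(threshold, w_last):
    """Level (threshold + w_n - 1) / w_n of formula (7).

    Returns None when w_n is not strictly positive or when threshold + w_n <= 1,
    since the condition is then not applicable or trivially satisfied.
    """
    if w_last <= 0 or threshold + w_last <= 1:
        return None
    return (threshold + w_last - 1) / w_last


weights = quantifier_weights(lambda x: x * x, 5)  # assumed shape of "most"
level = necessary_level(0.82, weights[-1])
print([round(w, 2) for w in weights], round(level, 2))
```

Every predicate of the query must then be satisfied at least to this level, which is what turns each fuzzy predicate into an interval condition on its attribute.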

From a practical point of view, expressions (10) and (11) can be used in two architectures.
The first one is a pre-processing based on an explicit query submitted to a regular DBMS such as :

select * from EMPLOYEE e where e.age ∈ [37,53] and e.salary ≥ 41000 and
e.commission ≤ 9800 and e.sales ∈ [1.2,2.8] and e.nb-children ∈ [1,3] and
(age ∈ [39,51] or salary ≥ 46000 or commission ≤ 8500 or sales ∈ [1.3,2.7] or nb-children = 2).

The second one would consist of using these conditions at the internal level of query processing inside an
extended DBMS able to evaluate fuzzy queries directly.


[Figure: membership functions plotted against age, commission, sales and nb-children; the legible panel titles are
low-commission, medium-sales and around(nb-children, 2), with the level .82 marked.]

Figure 3. Membership functions used in the example.

V. EVALUATION OF SET-ORIENTED FUZZY QUANTIFIERS 
5.1. Partitioning of relations and initial strategy 

In section 2, the principle for the evaluation of fuzzy quantified predicates applying to one (or several)
set(s) of tuples was given. Hereafter, we will consider queries of the form "select att from R group by att
having Q are D". The "group by" mechanism is illustrated in figure 4 and the naive algorithm is, for each subset
Rj (partition) of R where att = vj :

    for each ri in Rj do V[i] = μD(ri) enddo;
    order the vector V (in decreasing order) giving V';
    compute the weights W[i], knowing n, the cardinality of Rj ... ;
    GV = 0;
    for i from 1 to n do GV = GV + V'[i] * W[i] enddo;
    if GV ≥ λ then write(vj/GV) endif.




[Figure: relation R split on the attribute att into sets of tuples of R where the value of the attribute att is
the same (a), ..., (z).]

Figure 4. The "group by" mechanism in SQL.


5.2. Improved algorithm 

... whether the calculus should continue ... Stopping the calculus (which encompasses data access) can happen in
two circumstances : i) when the partition cannot reach the desired level (λ), ii) when it is certain that the
partition reaches the desired level (λ) and the precise value of the ... This is similar to what is done in the
design of "try and error" and ... candidates to be examined.

... calculated and in particular its last value wn. Thus, it ... condition (7) ... by inserting in the loop the
following instruction :

    if μD(rk) < (λ + wn - 1) / wn then exit endif

... partitions with a large number of elements will lead to a low value for wn and ...

Now, let us assume that we have already accessed k tuples of a partition (tuples r1 to rk), and that we have the
values V[i] = μD(ri) for i ∈ [1,k]. If we assume that the (n - k) missing values are 1 and the result of
the OWA aggregation remains under λ, according to formula (8) we can be certain that this partition will never
reach the desired level λ. We have to determine the aggregation :

$$\mathrm{OWA}(w_1, \ldots, w_n, v_1, \ldots, v_k, 1, \ldots, 1) = \Big(\sum_{i=1}^{n-k} w_i\Big) + \sum_{i=1}^{k} w_{n-k+i} \cdot v_i$$

assuming that the values V[1] to V[k] are sorted (in decreasing order). In addition, the expression has not to be
... Once again, we can specify a condition likely to stop the outer loop :

    insert μD(rk) into V[1:k];
    compute A = (Σ_{i=1..n-k} W[i]) + Σ_{i=1..k} W[n - k + i] * V[i];
    if A < λ then exit endif

When k = n (last tuple of the partition), the value of A equals ... to the partition.

Similarly, if the (n - k) missing values are assumed to be 0, we have :

$$\mathrm{OWA}(w_1, \ldots, w_n, v_1, \ldots, v_k, 0, \ldots, 0) = \sum_{i=1}^{k} w_i \cdot v_i$$

and, according to formula (9), ... stop the outer loop :

    insert μD(rk) into V[1:k];
    compute B = Σ_{i=1..k} W[i] * V[i];
    if B ≥ λ then write(vj); exit endif

Again, when k = n, the value of B equals that of A and is GV, the membership degree of the current partition.

We can now give the final algorithm, when n, the number of tuples of any partition, is known in
advance :

A = 0; compute the vector W; comment W[i] = Q(i / n) - Q((i - 1) / n) endcomment;
for each rk in Rj do
    if μD(rk) < (λ + wn - 1) / wn then exit endif;
    insert μD(rk) into V[1:k]; comment in decreasing order endcomment;
    A = (Σ_{i=1..n-k} W[i]) + Σ_{i=1..k} W[n - k + i] * V[i];
    if A < λ then exit endif;
    B = Σ_{i=1..k} W[i] * V[i];
    if B ≥ λ then write(vj); exit endif
    comment these two instructions are present only if we are not interested in the membership degree endcomment;
enddo;
if A ≥ λ then write(vj/A) endif; comment when k = n, A is GV, the membership degree of the partition endcomment;




Finally, we have to deal with the last case, namely the so-called vertical-2 fuzzy quantified predicates
(Q X's C are D). If we look at the definition given in section 2, we can see that the weights are dependent on
the value of the μC(xi)'s and it does not seem realistic to perform this calculus without the entire scan of the
underlying relation R. Consequently, the canonic algorithm derived from the definition can be applied and it
will require the exhaustive scan of all the partitions created by the "group by".

5.3. An example 

Let us consider the query "find the best 5 departments where most of the employees are well-paid"
which is expressed in SQLf as :

select 5 dep from EMPLOYEE group by dep having most are well-paid.

We examine a department (partition) containing five employees e1 to e5 with the following characteristics :

#emp    #dep    salary    ...
e2      d       38000     ...
e4      d       55000     ...
e1      d       46000     ...
e5      d       32000     ...
e3      d       48000     ...

with "most" represented by the function : x → x². The threshold λ is set to .73 and "well-paid" is the membership
function given in figure 5.



Figure 5. The membership function for the predicate "well-paid". 

So, the fuzzy set well-paid is {.8/e1, .4/e2, .9/e3, 1/e4, .1/e5} and the weight vector W is : w1 = .04, w2 =
.12, w3 = .2, w4 = .28, w5 = .36. If we perform the overall calculus for these data (naive strategy requiring the
access to the 5 tuples), we get : (.04 * 1) + (.12 * .9) + (.2 * .8) + (.28 * .4) + (.36 * .1) = .456; therefore,
this partition does not match our requirement (.73). Now, let us apply our improved algorithm assuming that
the tuples are accessed according to the order depicted above. Since (λ + w5) is over 1 ((λ + w5 - 1)/w5 = .25),
the first condition of the algorithm is interesting (not trivially satisfied).

Access employee e2 : μwell-paid(e2) = .4 > .25; A = .784 > .73; B = .016 < .73 => the loop goes on
Access employee e4 : μwell-paid(e4) = 1 > .25; A = .784 > .73; B = .088 < .73 => the loop goes on
Access employee e1 : μwell-paid(e1) = .8 > .25; A = .728 > .73 is false => the loop stops here.

In this case we save 2 accesses and if e5 were the first tuple of the considered partition, the loop would have
stopped immediately, since μwell-paid(e5) = .1 is under .25, and 4 data accesses would have been saved.
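The trace above can be reproduced with a minimal Python sketch of the improved algorithm. It is an illustration under assumptions rather than the authors' implementation: the function name `evaluate_partition`, its return convention (status, value, number of accesses) and the flag `need_degree` are introduced here, and the membership degrees are supplied directly instead of being read from a relation.

```python
import bisect


def evaluate_partition(degrees, weights, threshold, need_degree=True):
    """Scan one partition, stopping as soon as its fate is known.

    degrees: membership degrees in access order; weights: OWA weights (n values).
    Returns (status, value, accesses); value is None when the first test rejects.
    Assumes the partition holds at least one tuple.
    """
    n = len(weights)
    w_last = weights[-1]
    # Level of condition (7); only meaningful when w_n > 0.
    bound = (threshold + w_last - 1) / w_last if w_last > 0 else None
    seen = []  # kept in increasing order, reversed when needed
    a = 0.0
    for k, d in enumerate(degrees, start=1):
        if bound is not None and d < bound:
            return ("rejected", None, k)
        bisect.insort(seen, d)
        v = seen[::-1]  # the k known degrees, in decreasing order
        # A: optimistic value, the n - k unknown degrees are taken as 1
        a = sum(weights[: n - k]) + sum(weights[n - k + i] * v[i] for i in range(k))
        if a < threshold:
            return ("rejected", a, k)
        if not need_degree:
            # B: pessimistic value, the n - k unknown degrees are taken as 0
            b = sum(weights[i] * v[i] for i in range(k))
            if b >= threshold:
                return ("accepted", b, k)
    return ("accepted", a, n)


W = [0.04, 0.12, 0.2, 0.28, 0.36]        # "most" = x**2 with n = 5
order = [0.4, 1.0, 0.8, 0.1, 0.9]        # degrees of e2, e4, e1, e5, e3
print(evaluate_partition(order, W, 0.73))
print(evaluate_partition([0.1, 0.4, 1.0, 0.8, 0.9], W, 0.73))
```

With the access order of the example the scan stops at the third tuple, and it stops at the first tuple when e5 is read first, in line with the counts given in the text.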

VI. CONCLUSION 

In this paper, we have dealt with database management systems where conventional data are stored and which
support imprecise queries. More precisely, we have concentrated on fuzzy queries involving quantifiers. We
have distinguished two main classes of such queries : 1) those where the quantified condition applies to each
element of a set (x matches Q among {P1, ..., Pn}), and 2) those where the quantified condition concerns a
whole set of elements : a) Q X are D, b) Q X's C are D.

Our objective was to design some ... Starting from ... on the exhaustive scan of the considered set, where
the satisfaction degree is given by the ..., we have pointed out some properties of the OWA operator allowing for
some improvements, especially regarding data access. For type 1 queries, ... a query selecting a subset of the
elements likely to be evaluated and, if appropriate indexes are available, it is then possible to save data
accesses. For type 2a queries, where the number of elements ... is known, we have shown that conditions could be
applied to each element of the set to ... whether or not the calculus had to be continued.
This approach is very similar to the ... "try and error" or "branch and bound" algorithms. The basis of ... number
of data accesses. Consequently, ... of complexity of the final algorithms has not changed.

One interesting result of this work is ... queries involving quantifiers. Moreover, ... fuzzy predicate applies to
a monotonic aggregate (sum ...

REFERENCES

[...] P. Bosc, O. Pivert, "Some properties ...", ... (Austria), 1992.

[4] D. Dubois, H. Prade, "A review of fuzzy set aggregation connectives", Information Sciences, 36, 85-121,
1985.

[5] M. Jarke, J. Koch, "Query optimization in database systems", ACM Computing Surveys, 16, 111-152, 1984.

[8] R.R. Yager, "Connectives and quantifiers in fuzzy sets", Fuzzy Sets and Systems, 40, 39-75, 1991.

[9] R.R. Yager, "Fuzzy quotient operators ...", ... Fuzzy
Engineering Symposium, Yokohama (Japan), 289-296, 1991.

[10] L.A. Zadeh, "A computational approach to fuzzy quantifiers in natural languages", Computer Mathematics
with Applications, 9, 149-183, 1983.




N93-29570


A Fuzzy Case Based Reasoning Tool for Model Based Approach to Rocket
Engine Health Monitoring


Srinivas Krovvidy, Adam Nolan, Yong Lin Hu and William G. Wee
Artificial Intelligence and Computer Vision Laboratory
University of Cincinnati
Cincinnati, Ohio 45221-0030

Tel: (513) 5564778
Fax: (513) 5567326
Email: wwee@uceng.uc.edu


I. INTRODUCTION


One of the main requirements of a rocket engine Health Monitoring (HM) system is
its ability to recognize potential failures of all kinds, such that catastrophic failures can be
avoided through cutoff and other, less catastrophic failures can be avoided through repair
work. The HM system must have the ability to learn new situations and be able to recognize
potential failures. The behavior of key SSME performance parameters varies significantly
depending on engine power level and changing interface conditions (Nemeth et al., 1990,
Millis 1991). Parameters included in this list are turbine discharge temperatures, other
turbopump inlet and discharge temperatures and pressures, turbopump speeds, propellant
flow rates, and valve positions. Therefore, a model based approach is well suited to identify
dynamic, nominal operating values. In real HM operation, we are always confronted with
uncertain data, that is, data in which physical failure events occur. A fuzzy set approach (Kosko,
1992) to describe this data is most logical.

In recent years, researchers have been investigating a new paradigm for problem solving
and learning, by using specific solutions to specific situations (Riesbeck and Schank, 1989).
The basic idea is to make use of the old solutions while solving a new problem, and such an
approach is known as Case Based Reasoning (CBR) (Krovvidy & Wee, 1992, Riesbeck &
Schank, 1989). A model based approach is found to be one of the useful approaches for
designing planning systems (Birnbaum et al., 1991). Currently rocket engine protection
consists of redline systems that issue an engine cutoff if a measured value exceeds a
predetermined operation limit for any of several parameters (Millis, 1991). More recently, efforts
are being made to develop an advanced framework for a failure detection system with the
addition of model based algorithms (Hawman et al., 1991).

In this paper, we develop a fuzzy case based reasoner that can help in building such a
model from old cases and any existing domain knowledge. A detailed system description is
presented in this paper.

II. PROBLEM STATEMENT AND SUGGESTED APPROACH

... a fuzzy case based reasoner that can build a case
representation for the failures detected, and develop case retrieval methods that
can be used to index a relevant case when a new problem (case) is presented using fuzzy sets.
The choice of fuzzy sets is justified by the uncertain data. The new problem can be solved
using ... the old cases. This system can then be used to
... using cases and use this generalization to refine the
existing model definition. This in turn can help to detect failures using the model based
algorithms.

III. SYSTEM DESCRIPTION 

The proposed Fuzzy Case Based Reasoner (FCBR) is depicted in Figure 1.

[Figure: block diagram with the components Justifier, Case Base, Learner, Modifier, Storer and Retriever.]

Figure 1. Proposed Fuzzy Case Based Reasoner


... either trained (learned) or decided (tested). This case definition allows a decision to be
made every T seconds with a data width of m samples. It is therefore possible to generate a
... interval ΔT ...
kΔT with k << m can be used as the decision interval. We need to be sure that m samples
over T seconds is enough to model both the engine start-up and shut-down, and main stage
operation.

It is very difficult to give a definite relationship between data collected and the fault
occurring at a given time and at a specific location because of many uncertainties. A
monotonic fuzzy number is modelled using time and sensor location for each case. This
... in discrete time space will be used in generating both training and testing cases. In
training, a fuzzy number from [0,1] is generated and associated with the data of an n by m
... In the testing phase, we need to consider multiple decisions under different time
scales. When a new problem is given, we use FCBR to find the closest case from the ...
case base. We will predict the chances of failure at different future time periods and then propose
a general decision scheme for the given case from those predictions.


IV. COMPONENTS OF FCBR


1) Retriever, Modifier, Justifier, Storer and Learner: In diagnostic design,
retrieval should be done based on the qualitative description of the problem and the causal
relations in the explanation of the design solution. The indexing mechanism must also allow
us to access cases at any level of the representation using fuzzy set theorems. The criteria
that are used to evaluate whether a case is similar enough to the current design problem
should use the salient features of the domain. In the SSME problem, we must use the sensor
data to evaluate the applicability of an old case for a new problem.

If the retrieved solution is not acceptable, the Modifier tries to adapt and synthesize
different parts of the design into a solution using fuzzy sets. The Modifier can help us to
suggest the necessary changes to be made in the dynamic modelling of the failure. The
Justifier justifies the suggested solution.

The Storer stores the case. When a solution for a given problem is obtained, it must be
stored in the case base for future retrieval. When a set of sensor data is diagnosed for failure
prediction, that data within the defined window (n by m matrix) must be stored in the case
base. The cases would be diagnosed based on some monotonic fuzzy number. This value
would define the chances of failure for that particular data set. Therefore, the cases are stored
using fuzzy set concepts. The Learner then develops generalized solution strategies from the
stored fuzzy cases. This is particularly important because of the enormous amount of data
generated by the sensors. Therefore, we will develop generalization methods to take several
cases and represent them in some form of rules so that we can control the size of the case
base.


V. PRELIMINARY EXPERIMENTAL RESULTS

Some preliminary experiments are performed using the data from several sensors. In
particular, we selected 4 sensors and defined a fuzzy case based reasoning system.


1) Data sets selection

In general the cases are defined based on multiple sensors. Our current test is
restricted to the problem of detecting faults in the HPFTP (High Pressure Fuel Turbopump).
Four sensors are selected and listed in Table 1.


ID    PID NO.    LABEL
1     7          MCC Pressure
2     17         HPFT Discharge Temperature
3     77         MCC Hot Gas Injector Pressure A
4     78         MCC Coolant Discharge Temperature B

Table 1. SELECTED SENSORS FOR CASE STUDY

The data sets of tests 902-457, 902-463, and 901-436 are used in our current
study (Hawman et al., 1990). The tests 902-457 and 902-463 are two nominal data sets
with no shutdown. The test 901-436 was reported as having a problem of HPFTP
coolant liner buckle. It was shut down due to a HPFT discharge temperature redline at t
= 611.035 seconds.


2) Case definition

The sampling interval is 0.04 seconds, which means T/0.04 samples are generated
in T seconds. A case is defined as the samples generated in T seconds. In particular this
is represented by a T x 4 vector. All the cases obtained from tests 902-457 and 902-463
are considered to be safe. The cases obtained from test 901-436 have varying levels of
failure modes. In other words, the cases collected well before the breakdown have a low
possibility of failure while those cases closer to the breakdown have a high possibility of
failure.


3) Normalization of the data

Since the value of sensors highly depends on the power level, a normalization
procedure corresponding to the power level is applied. The MCC pressure (MCC_PC) is
proportional to the power level. It is used to define the measurement of power level.
The power level (PL(t)) is defined as the ratio of the MCC_PC value to a predefined
standard MCC_PC value (MCC_PC_STAND) :

PL(t) = MCC_PC(t) / MCC_PC_STAND

The corresponding sensor level, CSL(i,t) = Sensor_Value(i,t) / Sensor_stand(i), can be estimated by some
polynomial functions of power level (PL(t)) as follows :

CSL(i,t) = C1*PL^3(t) + C2*PL^2(t) + C3*PL(t) + C4


The coefficients (C1, C2, C3, C4) are obtained based on the nominal test data 902-457
with a linear regression technique. The Sensor_stand(i) is the standard value of sensor i,
which is predefined based on data from the nominal test 902-457 corresponding to the
predefined standard MCC_PC value. The normalized value of each sensor is computed
as follows :

norm(i,t) = Sensor_Value(i,t) / (Sensor_stand(i) * CSL(i,t))


For our defined case matrix

$$A_i = (a_{i1}, a_{i2}, a_{i3}, \ldots, a_{iN}),$$

a normalized average percentage error (APERR) is defined as the case index :

$$APERR(t) = \frac{1}{4} \sum_{i=1}^{4} \left| 1.0 - enorm(i,t) \right|$$

where enorm(i,t) is the average norm of sensor i within N data points :

$$enorm(i,t) = \frac{1}{N} \sum_{k=-N/2}^{N/2} norm(i, t + kT)$$

N is the number of samples within the window and T = 0.04 is the sample rate.
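The normalization chain can be sketched in Python as follows. The function names, the coefficient tuple and the numeric values are hypothetical and chosen only for illustration; in particular the window average is passed in directly, since its exact boundaries depend on the window definition above.

```python
def power_level(mcc_pc, mcc_pc_stand):
    """PL(t): MCC pressure relative to the predefined standard MCC pressure."""
    return mcc_pc / mcc_pc_stand


def corresponding_sensor_level(pl, coeffs):
    """CSL(i,t): cubic polynomial in the power level, coefficients (C1, C2, C3, C4)."""
    c1, c2, c3, c4 = coeffs
    return c1 * pl ** 3 + c2 * pl ** 2 + c3 * pl + c4


def norm(sensor_value, sensor_stand, csl):
    """norm(i,t): sensor value normalized by its standard value and expected level."""
    return sensor_value / (sensor_stand * csl)


def aperr(enorms):
    """APERR(t): mean absolute deviation from 1.0 of the windowed sensor norms."""
    return sum(abs(1.0 - e) for e in enorms) / len(enorms)


# Hypothetical values: at the standard power level the expected sensor level is 1.
pl = power_level(3000.0, 3000.0)
csl = corresponding_sensor_level(pl, (0.0, 0.0, 0.0, 1.0))
print(norm(110.0, 100.0, csl), aperr([1.0, 1.1, 0.9, 1.0]))
```

A case whose four windowed norms stay close to 1.0 gets an APERR near 0, which is the behaviour expected of nominal data.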

The cases are grouped such that they are classified into one of the categories {high
risk, moderate risk, low risk and no risk}. The cases are stored in a case base. The retrieval
from the case base is done using a hierarchical indexing. At the first level, we take the sample
and compute its APERR. Then we retrieve all those cases with a similar APERR.
In the next step, we use a function defined on the first sensor data. The matching is continued
until we identify the group to which the sample belongs. After obtaining the group, we can
associate with the new problem the same possibility of breakdown as that of the identified
group. The grouping of different cases is shown in Figures 2 and 3. This has been
identified as the primary index. With more sensors we expect to develop several such indexes
and also more categories of cases. We also want to compare the results with other methods.




Figures 2. Grouping of cases in Breakdown and nonbreakdown data 

The methods proposed were presented in the context of a specific sensor data set
analysis. The primary reason for this is to be able to compare the recent performance (Hawman
et al., 1990) of regression analysis and linear predictors to that of the fuzzy case based
reasoner. With adequate performance, FCBR will be utilized as sensor models for several
parameters deemed relevant by the 1990 sensor study (Career et al., 1990). This will
enable the development of a fault detection system which would be less complex and more
accurate than previously proposed methods.

The application of these methods is not limited to SSME data. Success in this study
implies wide ranging application to all engine monitoring systems.

VI. REFERENCES

Hawman et al., ... Report, NASA-CR-185224,
United Technologies Research Center, March 1990.

Kosko, B., Neural Networks and Fuzzy Systems, Prentice Hall, 1992.

Krovvidy, S. and Wee, W.G., Wastewater Treatment Systems from Case Based Reasoning,
Special issue on Case Based Learning of Machine Learning (1992, in press).

Millis, M.G., Technology Readiness Assessment of Advanced Space Engine Integrated
Controls and Health Monitoring, Proceedings of the Third Annual Health Monitoring
Conference for Space Propulsion Systems, Cincinnati, Ohio, November 1991.

Nemeth, E., Maram, J. and Norman, Jr., A.M., Health Management System for Rocket
Engines, Proceedings of the Second Annual Health Monitoring Conference for Space
Propulsion Systems, Cincinnati, Ohio, November 1990.

Riesbeck, C.K., & Schank, R.C., Inside Case-Based Reasoning, Lawrence Erlbaum
Associates, Publishers, Hillsdale, New Jersey, 1989.




N93-29571 


A High Performance, 

Ad-Hoc, Fuzzy Query Processing System 
for Relational Databases 

William H. Mansfield, Jr. 

Bellcore, USA 

whm@thumper.bellcore.com

Robert M. Fleischman** 

BBN, USA 

rmfi@diamond.bbn.com

ABSTRACT 

Database queries involving imprecise or fuzzy predicates are currently an evolving area of academic and
industrial research [Boc87,Bosc88,Prad87,Tah77,Uma83,Zeni85]. Such queries place severe stress on the
indexing and I/O subsystems of conventional database environments since they involve the search of large 
numbers of records. The Datacycle™ architecture and research prototype is a database environment that 
uses filtering technology to perform an efficient, exhaustive search of an entire database. It has recently 
been modified to include fuzzy predicates in its query processing. The approach obviates the need for 
complex index structures, provides unlimited query throughput, permits the use of ad-hoc fuzzy 
membership functions and provides deterministic response time largely independent of query complexity and 
load. This paper describes the Datacycle prototype implementation of fuzzy queries and some recent 
performance results. 

1. Introduction 

In relational database systems [Codd70] databases contain tabular representations of information where rows 
represent database records (tuples) and columns represent fields (attributes) within the records. Relational 
algebra defines operations that can be carried out to specify particular query requests in which attribute
values and Boolean logic are used to identify sets of records of interest. Structured Query Language (SQL)
is a query language that defines the grammar and the user interface between an application and the database
management system. In SQL, database data retrieval operations are defined in select statements of the form

Select attribute-list from relation where predicate

where the attribute-list identifies values to be returned to the user, relation identifies a particular table in the 
database, and the predicate identifies a search criteria consisting of Boolean expressions involving attribute 
names and values. One characteristic of these queries is that the user must be very familiar with the contents 
of the database, from both the perspective of structure, as well as the value range for particular attributes. 
Mechanisms to introduce meaningful imprecise terms into the predicate such as young, old, high, and low 
do not exist. 

Fuzzy set theory [Zad65] has been proposed as one method for introducing imprecise queries into database 
systems. Efforts have been made to pre-process imprecise requests [Gala91],[East87] into a relational query 
language such as SQL or QUEL where a request for young employees might be translated into a range 
request for employees between the ages of 20 and 30. 

Membership functions provide the method to translate an attribute value to a degree of membership in a
fuzzy set, referred to as a possibility value. Figure 1 shows membership functions that map age values into
the fuzzy sets YOUNG, MIDDLE AGE and OLD. Ages less than 15 are definitely members of the set
YOUNG and have possibility values equal to 1.0. For ages between 15 and 25, the degree of membership
decreases ..., so that ages over 25 are not members of the set YOUNG.

** Work was performed while the author was at Bellcore.
™ Datacycle is a trademark of Bellcore.
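A membership function of this kind can be sketched in a few lines of Python. The linear decrease between 15 and 25 is an assumption made for illustration; the text only states that membership is 1.0 below 15 and that it falls off over the following interval.

```python
def young(age):
    """Possibility value of an age in the fuzzy set YOUNG (assumed piecewise linear)."""
    if age <= 15:
        return 1.0          # definitely a member
    if age >= 25:
        return 0.0          # not a member
    return (25 - age) / 10  # assumed linear decrease between 15 and 25


for age in (10, 20, 30):
    print(age, young(age))
```

Translating a fuzzy request into a crisp range request, as in the pre-processing approaches cited above, amounts to keeping the ages whose possibility value exceeds a chosen cut level.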


[Figure: membership functions YOUNG, MIDDLE AGE and OLD plotted against Age in Years (5 to 60).]

FIGURE 1

Membership functions for YOUNG, MIDDLE AGE and OLD

... the database during query processing. ...
executing the membership functions ... of fuzzy predicates. Arbitrary queries
... approach ... index structures and force the run-time execution of
... and union operations involving large sets.

... implement. While the index approach provides ..., in contrast, to maximize query
flexibility ... limiting the user to a small number of ... that can be specified within the query grammar
... is required. Using the Datacycle approach ... for an unlimited number of concurrent users.
... of membership functions in the query grammar, arbitrary use of
numeric attributes in the database, and high performance.

... divide the primitives ... varied
within the application ... tune the membership functions over time to the underlying database, or to
... of this work is the dynamic modification of membership
functions to permit their use over very different data attributes.

... approach, and Section 7 offers our conclusions.

2. The Datacycle™ Architecture and Research Prototype

[...] relational (crisp) queries. Section 3 then builds on the architecture description and characteristics to describe
our implementation of fuzzy queries. 

[...] example of the architecture's flexibility.

[...] appears [...] some data manipulation primitives providing non-traditional [...] using ANSI SQL [...] solely
in terms of the relational schema and the [...] of attributes. Thus, [...] penalty.

The Datacycle system model is depicted in Figure 2 and comprises an arbitrarily large number of [...]
specific microprocessors whose architecture and instruction set are optimized for synchronous, high speed
search. The presence of the contents of the database on the broadcast channel provides the [...] based on the
values of any attribute or combination of attributes. [...] communication [...] be geographically distributed
over wide areas. Database scaling is achieved using multiple pumps and their associated communication and
filtering subsystems.

FIGURE 2.

The Datacycle Architecture
(components shown: Broadcast Media, Pump, Update Manager, Access Managers with SQL interfaces, Applications)


The custom VLSI datafilters have an instruction set that is optimized for Boolean [...] devices allowing random
access in the foreground buffer [...] oriented. Thirty-two instructions can be executed while an individual record
is present in the filter. A single instruction is sufficient to complete a 4-Byte comparison and mark a record for
selection,
associate it with a specific query, and initiate output. Complex, multi-predicate selections or several
independent selections can be performed simultaneously in the filter within the 32-instruction
constraint. The datafilter instruction set includes arithmetic instructions that operate on integer data values.
The ability to calculate numeric functions based on database contents provides the primitive
necessary to perform membership functions on-the-fly while data records are present in the filter.

In the Datacycle experimental research prototype, the storage pump is implemented in a 32 - 128 MByte
dual ported, banked RAM that allows the storage contents to be read sequentially for broadcast while
portions of the database are available for update operations. The memory contents are broadcast over a 32
bit wide communication channel at 53 MBytes per second. A 16 MByte database will appear on the
broadcast channel once every 0.3 seconds and the system will offer the user about 1 second response time for
selects against the database. A 32 MByte database provides storage for 256K 128 byte tuples.

The content-addressability and full database scan, coupled with the flexibility of the filtering subsystem,
permit a variety of database selection operations that are particularly troublesome to conventional database
system approaches. In an Operator Services (telephony) application setting, we have included longitude and
latitude information for every customer in the database. Multi-dimensional range searches can be completed
in a single broadcast cycle and spatial queries including CLOSEST (find the nearest object in the database)
can be dealt with in two passes (one to identify the object and a subsequent pass to retrieve it). The
CLOSEST function requires that a distance function be calculated on-the-fly within the filter. This
calculation is representative of a larger set of selection operations that perform arithmetic transformations
on one or more attributes. In conventional systems, these transformations often negate the advantages of
traditional database index structures, forcing a full database scan that requires extensive processing and causes
extreme response time delays. In the Datacycle architecture, since a full database scan is always performed,
variations in query complexity are often handled in constant response time.

3. Fuzzy Queries within the Datacycle Experimental Prototype 

We have recently completed an investigation of fuzzy query processing in the Datacycle architecture. Our
work has centered on storing crisp database values and applying fuzzy query predicates during selection
operations. Fuzzy requests define algorithmic membership functions that map the value of a database
attribute to the degree to which it meets a fuzzy predicate. Fuzzy selection predicates include imprecise
qualifiers such as near, high, old, best, tall, etc. Several of these membership functions can be combined
using fuzzy logic to identify data objects that best meet a number of vague or imprecise selection
specifications. For example, a fuzzy database request may ask for circuits with a high signal-to-noise ratio
and a low maintenance history that terminate near a particular location. Such requests place a high degree
of stress on the indexing and I/O operations in conventional database systems because they force the system
to consider large numbers of tuples in a search to find an optimal match, or some number of "best" matches. The
Datacycle filtering primitives can efficiently evaluate membership functions and the fuzzy
logic necessary to combine them.

In the Datacycle prototype, we have chosen to utilize SQL extensions based on fuzzy queries consistent 
with previous fuzzy query grammars [Buc83,Kac89,Tah77, Zem85]. We have extended the grammar to 
allow the dynamic definition of membership functions from the application level. 

Select * from R where att is qual
    att = attribute name
    qual = fuzzy term

The extended SQL query to select all the records for individuals in the fuzzy set YOUNG would be: 

Select * from R where age is YOUNG 

The fuzzy term can be defined as a trapezoid as depicted in the general case in Figure 3. The breakpoints
{A,B,C,D} define the range (support) of the membership function. For cases other than the general case
we have chosen to use a strict positioning of variables and the use of nulls for unspecified parameters (e.g.
{,,C,D}). Other more natural alternatives [Zem85] for specifying these functions have been suggested. We
chose this explicit notation to simplify parsing during query processing.
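A trapezoid of this kind can be evaluated with a few comparisons and one linear interpolation. The sketch below is illustrative only: the function name `trapezoid` and the reading of a null pair as "no edge on that side" are assumptions, not the prototype's implementation.

```python
def trapezoid(value, a=None, b=None, c=None, d=None):
    """Degree of membership of `value` in the trapezoid {a, b, c, d}.

    Membership rises from 0 at a to 1 at b, stays at 1 up to c, and falls
    back to 0 at d. A null (None) a/b pair is read here as "no rising
    edge"; a null c/d pair as "no falling edge" (an assumption).
    """
    if a is not None and b is not None:
        if value <= a:
            return 0.0
        if value < b:
            return (value - a) / (b - a)
    if c is not None and d is not None:
        if value >= d:
            return 0.0
        if value > c:
            return (d - value) / (d - c)
    return 1.0


# {,,15,25}: full membership up to 15, none from 25 onward.
print(trapezoid(10, c=15, d=25), trapezoid(20, c=15, d=25), trapezoid(30, c=15, d=25))
```

Keyword arguments stand in for the positional nulls of the query grammar, so `{,,15,25}` becomes `c=15, d=25`.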


[...] the breakpoints may be specified within the grammar. [...] instead of the fuzzy term YOUNG, [...]
the young individuals named Smith would be

Select * where name=smith and age is {,,15,25}


FIGURE 3

Trapezoidal Membership Functions (general case)

The prototype supports multiple fuzzy predicates during a single selection and combines the results of
[...] operations.

Select * where name=smith and age is YOUNG and height is TALL

Due to the characteristics of the current datafilter, membership functions are limited to piecewise linear
functions. This restriction is due to the lack of a multiply instruction in the VLSI datafilter. Our
implementation uses repetitive addition to emulate a multiply instruction. The combination of multiple
overlapping functions can be used to approximate [...] more complex
functions.
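The two ideas in this paragraph, multiplication by repeated addition and the combination of overlapping piecewise linear functions, can be sketched as below. Everything here is an assumption made for illustration: the fixed-point `SCALE`, the helper names, and the use of both min and max as combining operators (the text does not say which operator the prototype applies).

```python
SCALE = 300  # hypothetical fixed-point value standing for a membership of 1.0


def multiply_by_addition(x, n):
    """Compute x * n for a non-negative integer n using only addition."""
    total = 0
    for _ in range(n):
        total += x
    return total


def falling_edge(value, c, d):
    """Integer membership (0..SCALE) for the shape {,,c,d}.

    Uses comparisons, subtraction and repeated addition only. The slope
    `step` is assumed to be precomputed when the query is compiled.
    """
    if value <= c:
        return SCALE
    if value >= d:
        return 0
    step = SCALE // (d - c)
    return SCALE - multiply_by_addition(step, value - c)


def combine_min(value):
    """Fuzzy intersection of {,,15,20} and {,,10,25}."""
    return min(falling_edge(value, 15, 20), falling_edge(value, 10, 25))


def combine_max(value):
    """Fuzzy union of the same two shapes."""
    return max(falling_edge(value, 15, 20), falling_edge(value, 10, 25))


print(combine_min(17), combine_max(22))
```

Either combination yields a curve with two different slopes, which is how a pair of simple trapezoids can stand in for a more complex shape.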



Select * where age is {,,15,20} [...] age is {,,10,25}

The selection process permits both hardware and combined hardware-software filtering. The VLSI datafilter
[...] reducing the amount of information presented on the [...]
I/O bandwidth the downstream database processing environment can manage. Where possible, it is usually
advantageous to complete this filtering operation in the VLSI datafilter. Where the complexity of the
request exceeds the capability of the datafilter, a partial predicate, or an approximation can be used in the
datafilter, and the downstream software can complete the predicate or apply a precise operation. For
instance, we have approximated a distance function in the datafilter with the approximation $D_{xy} = |x| + |y|$.
For some applications, this is sufficient. We use this distance calculation in a fuzzy near predicate. For
those applications requiring a precise distance function, the approximation can yield a superset of the answer
set and downstream software can apply a Euclidean distance function to identify correct tuples or a correct
ordering of tuples. This technique achieves greater flexibility and permits applications where the selection
predicates exceed the capacity or primitives of the VLSI datafilter. Using this technique it is possible to
approximate non-linear functions with piecewise linear functions, and subsequently apply the precise
non-linear function in software outside the datafilter.
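The two-stage filtering can be sketched as follows. This is an illustration under stated assumptions, not the prototype's code: the function names are hypothetical, and the choice of `sqrt(2) * radius` as the coarse bound is ours, based on the inequality |dx| + |dy| <= sqrt(2) * sqrt(dx^2 + dy^2), which guarantees that the coarse stage never drops a point the exact stage would keep.

```python
import math


def manhattan(dx, dy):
    """Coarse distance |dx| + |dy|: additions and absolute values only."""
    return abs(dx) + abs(dy)


def near(points, center, radius):
    """Return (candidates, exact) for points within `radius` of `center`.

    Stage 1 stands in for the hardware filter: a cheap bound that yields a
    superset of the answer. Stage 2 stands in for the downstream software:
    the precise Euclidean test applied to the candidates only.
    """
    cx, cy = center
    bound = math.sqrt(2) * radius
    candidates = [p for p in points
                  if manhattan(p[0] - cx, p[1] - cy) <= bound]
    exact = [p for p in candidates
             if math.hypot(p[0] - cx, p[1] - cy) <= radius]
    return candidates, exact


points = [(3, 4), (4, 4), (6, 0), (1, 1)]
candidates, exact = near(points, (0, 0), 5)
print(candidates, exact)
```

The candidate list is larger than the exact answer, but the expensive test runs only on the reduced set.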

4. Dynamic Fuzzy Queries 

One problem with a static predefinition of membership functions (i.e. YOUNG is less than age 25) is that
the binding may make sense relative to the domain of the attribute in general (over all age groups), but for
specific cases, may make no sense at all. For instance, suppose we were to apply the fuzzy predicate
YOUNG as defined in Figure 1 to either elementary school children or nursing home adults. The definition
is totally inappropriate. To partially overcome this shortcoming, we have implemented a dynamic fuzzy
predicate which defines the membership function in terms of statistics and dynamically adjusts the function
to the domain of the predicate. In this case YOUNG is defined in terms of percentiles of the domain space
and interpreted as definitely young up to the 10th percentile, decreasing in membership value from the 10th
to the 20th percentile, and not YOUNG beyond the 20th percentile. When applied to the domain of the predicate,
the membership function is scaled appropriately as depicted in Figure 4. In the case of the current Datacycle
prototype, the domain can be obtained by simply determining the maximum and minimum values of an
attribute given additional predicate constraints (elementary school or nursing home). This can be
accomplished by observing the data stream on a single cycle prior to the actual fuzzy selection. Using
multiple filters or additional cycles, data distributions can be obtained if severe data skew is present and
needs to be taken into consideration. The select statements indicated in Figure 4 show the extended SQL for
the dynamically scaled requests. This approach does not create membership functions as in [Kam90], but
rather, transforms "existing" membership functions to different populations.
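The scaling step can be sketched as below, assuming the simple case described above in which the domain is taken from the minimum and maximum observed values. The function names and the sample ages are hypothetical.

```python
def domain(values):
    """One pass over the data stream to find the attribute's range."""
    return min(values), max(values)


def scale_breakpoint(fraction, lo, hi):
    """Map a fraction of the domain (0.0 to 1.0) to an attribute value."""
    return lo + fraction * (hi - lo)


def relative_young(value, lo, hi):
    """YOUNG relative to the domain [lo, hi].

    Full membership up to the 10% point of the range, falling linearly to
    zero at the 20% point. Using the min/max range instead of true
    percentiles ignores data skew.
    """
    c = scale_breakpoint(0.10, lo, hi)
    d = scale_breakpoint(0.20, lo, hi)
    if value <= c:
        return 1.0
    if value >= d:
        return 0.0
    return (d - value) / (d - c)


# Hypothetical populations: the same definition lands on different ages.
school_lo, school_hi = domain([5, 7, 9, 12, 15])
home_lo, home_hi = domain([60, 72, 85, 93, 100])
print(relative_young(6.5, school_lo, school_hi), relative_young(66, home_lo, home_hi))
```

The first pass (`domain`) corresponds to the extra broadcast cycle; the second evaluates the rescaled function during selection.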




select where age RIS YOUNG and group=elementary
select where age RIS YOUNG and group=nursing

Figure 4

Dynamic Fuzzy Query Processing
(horizontal axis: Age in Years)





5. Performance

We have generated a modest [...] database [...] tuples (32 Megabytes) to exercise the fuzzy query
functionality and populated [...] function of database broadcast cycle [...] with linear degradation in
response time [...] prototype is 128 Megabytes in a single [...] with 1 million 128 byte tuples because the
[...] pumps and utilizing multiple filters.

[...] demonstrated the on-the-fly calculation of two membership functions and the fuzzy [...] within the
VLSI datafilter. [...] algorithm is to project resulting membership values to software outside the VLSI
[...] on-the-fly. A subsequent query [...] executed against the database [...] Currently we select up to the
best 50 [...] specific records in a single cycle, and the max of 50 [...] selection of any number of records.

Figure 5 provides [...] database sizes (256K, 128K and 64K tuples) [...] to deal with arbitrarily large numbers
[...] as is the query throughput. [...] 64K tuple curves and shows that [...] with the full ad-hoc capability.
[...] of complexity. The two predicate curve represents processing a [...] functions operating on different
attributes (SELECT [...] AGE is YOUNG) and the combination of their results with fuzzy logic operators.
[...] concurrency (1-3 queries per second), the response time is nearly identical [...] The response time
represents the time necessary to receive an extended SQL request, [...] datafilter instruction buffer, [...]
values, select individual records on a [...] for scaling and marginally impacts performance.

With multiple [...] hoc fuzzy queries.


Figure 5

Fuzzy Query Performance - Single Filter
Select * where Name=Lee and height is tall
(horizontal axis: Throughput (Queries/Second); curves for 128K Tuple and 64K Tuple databases;
Lines = Single Fuzzy Predicate, Points = Two Fuzzy Predicates)


6. Future Work 

Our work to date has centered around prototyping basic query functionality and implementing the dynamic
fuzzy query capability. The full database search and content-addressability characteristics of the Datacycle
architecture make it particularly attractive for a number of further extensions.

Multi-dimensional membership functions

Membership functions involving more than one attribute can be dealt with efficiently as a small change to
the system since the values of all the tuple's attributes are available during the run-time evaluation of a
membership function. Thus planar surfaces defined as $Z = c_1 X + c_2 Y + c_3$ are a possible alternative to
linear functions. Using multiple datafilters, multiple intersecting planes such as those depicted in Figure 6
are [...]. We are particularly interested in spatial and directional issues such as north and the
combination of direction and distance.
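A planar membership function over two attributes can be sketched as follows. The clipping of the plane to the interval [0, 1], the use of min to intersect planes, and the coefficients chosen for a "north" predicate are all assumptions made for illustration.

```python
def plane(c1, c2, c3):
    """Membership surface Z = c1*X + c2*Y + c3, clipped to [0, 1]."""
    def membership(x, y):
        z = c1 * x + c2 * y + c3
        return max(0.0, min(1.0, z))
    return membership


def intersect(*functions):
    """Fuzzy intersection of several surfaces (pointwise minimum)."""
    return lambda x, y: min(f(x, y) for f in functions)


# Hypothetical "north" predicate over (delta longitude, delta latitude):
# membership grows with northward offset and shrinks with east/west offset.
north = intersect(plane(-0.5, 0.5, 0.0), plane(0.5, 0.5, 0.0))

print(north(0, 2), north(0, 1), north(1, 1))
```

Each plane needs two multiplications and two additions per tuple, so one plane could be assigned to each datafilter and the results combined afterwards.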

Hedges 

A mechanism for modifying membership functions with standard hedge [Zad72] terms such as very and
somewhat needs to be addressed. These operators typically involve applying non-linear functions like the
square or square-root of a membership value. These operators may be approximated using piecewise linear
functions.
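The approximation can be sketched as below. The hedge definitions (square for very, square root for somewhat) follow the description above; the interpolation helper and the choice of five evenly spaced breakpoints are assumptions.

```python
import math


def very(mu):
    """Hedge 'very': square of the membership value."""
    return mu * mu


def somewhat(mu):
    """Hedge 'somewhat': square root of the membership value."""
    return math.sqrt(mu)


def piecewise_linear(func, breakpoints):
    """Return a piecewise linear approximation of `func`.

    The approximation is exact at the breakpoints and interpolates
    linearly between neighbouring ones.
    """
    xs = sorted(breakpoints)
    ys = [func(x) for x in xs]

    def approx(mu):
        for i in range(len(xs) - 1):
            x0, x1 = xs[i], xs[i + 1]
            if x0 <= mu <= x1:
                return ys[i] + (ys[i + 1] - ys[i]) * (mu - x0) / (x1 - x0)
        raise ValueError("membership value outside the breakpoint range")

    return approx


approx_very = piecewise_linear(very, [0.0, 0.25, 0.5, 0.75, 1.0])
print(approx_very(0.6), very(0.6))
```

With breakpoints 0.25 apart, the error for the square stays below 0.016; the square root is steeper near zero and would need denser breakpoints there.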

Concurrency Control in a Fuzzy Transaction Processing Environment
The Datacycle architecture and research prototype includes the implementation of a full transaction model to
guarantee database consistency and query correctness in the face of concurrent transaction execution. The
implementation includes optimistic concurrency control and a predicate based conflict detection algorithm
that may prove advantageous in identifying conflicts between fuzzy transactions where standard record based
locking schemes may be inappropriate. Relaxing strict concurrency control serialization requirements by


[...] that include fuzzy predicates by re-executing
[...] coupled with a threshold set to a particular conflict [...]
could be identified during the optimistic concurrency control conflict
detection phase, and only those that represent high degrees of conflict [...]


Figure 6

Two Dimensional Membership Functions
North = F(Δlongitude, Δlatitude)


[...] digital [...] processors [...] of the custom VLSI datafilter. These processors include support for
floating point and [...] This is especially important since the current VLSI datafilter [...]
complexity of membership functions that are executed on-the-fly.

We have speculated that the combination of on-the-fly membership function execution and membership
function definition at the query grammar level provide the primitives for an adaptive feedback mechanism
and eventually learning.


7. Conclusions

The combination of the Datacycle architecture's full database broadcast and efficient filtering can be used for
[...] operations. This flexibility permits various applications to [...] views of the data [...]
a user to process searches beyond the capabilities of current database management systems. The
work reported in this paper resulted in a fuzzy query capability in a high volume query processing
[...] of this work include a scalable query environment for fuzzy queries
against small [...] (order gigabyte), the use of on-the-fly membership
execution that permits ad-hoc fuzzy queries, and the definition and implementation of dynamic fuzzy queries
that adjust a statistical membership function to the attribute domain.


Acknowledgments 

We would like to thank other members of the Datacycle prototype research team including
Munir Cochinwala, Kasey Lee, and John Raitz, and Bill Buckles for his advice and assistance while we
were identifying and pursuing fuzzy database issues.


References

[Bosc88]  Bosc, P., Galibourg, M., Hamon, G., "Fuzzy Querying with SQL: Extensions and
          Implementation Aspects", Fuzzy Sets & Systems, Vol 28, pp 333-349, 1988.

[Bow91]   Bowen, T.F., Gopal, G., Herman, G.E., Mansfield, W.H., "A Scalable Database Architecture for
          Network Services", IEEE Communications Magazine, January 1991, pp 52-59.

[Bow92]   Bowen, T.F., Fleischman, R.M., Herman, G.E., Mansfield, W.H., "The Datacycle
          Architecture and Research Prototype: A Quantitative Analysis", Research Issues in Data
          Engineering, February 1992.

[Buc83]   Buckles, B.P., Petry, F.E., "Query Languages for Fuzzy Databases", Management Decision
          Support Systems, pp 241-252, 1983.

[Buc87]   Buckles, B.P., Petry, F.E., "Generalized Database and Information Systems", Analysis of
          Fuzzy Information, Vol 2, pp 177-201, 1987.

[Codd70]  Codd, E.F., "A Relational Model of Data for Large Shared Data Banks", Communications of
          the ACM, June 1970.

[Dub80]   Dubois, D., Prade, H., "Fuzzy Sets and Systems: Theory and Applications", Academic Press,
          1980.

[East87]  [...], "[...] to Approximate Retrieval in Database Management Systems",
          NAFIPS 87: Proceedings of the North American Fuzzy Information Processing Society
          Workshop, May 1987.

[Gala91]  Gala, [...], Chawala, D., Eastman, C., "Combining Fuzzy and Nonfuzzy Approximate Retrieval
          in a Database Management System", NAFIPS 91: Proceedings of the North American Fuzzy
          Information Processing Society Workshop, May 1991.

[Her87]   Herman, G.E., Gopal, G., Lee, K.C., Weinrib, A., "The Datacycle Architecture for Very High
          Throughput Database Systems", Proceedings of the ACM SIGMOD, 1987.

[Kac89]   Kacprzyk, J., Zadrozny, S., Ziolkowski, A., "FQUERY III+: A Human Consistent
          Querying System Based on Fuzzy Logic with Linguistic Quantifiers", Information
          Systems, Vol 14, No. 6, pp 443-453, 1989.

[Kam90]   Kamel, M., Hadfield, B., Ismail, M., "Fuzzy Query Processing using Clustering
          Techniques", Information Processing & Management, Vol 26, No. 2, pp 279-293, 1990.

[Lee91]   Lee, K.C., Matoba, T., Herman, G.E., Mansfield, W.H., "A Rapid Turnaround Design of a
          High Speed VLSI Search Processor", Integration: the VLSI Journal, Vol. 10, pp. 319-337,
          1991.

[Lee91]   Lee, K.C., Matoba, T., Mak, V., "VLSI Accelerators for Large Database Systems", IEEE
          Micro, December 1991.

[Prad87]  Prade, H., Testemale, C., "Representation of Soft Constraints and Fuzzy Attribute Values by
          means of Possibility Distributions in Databases", Analysis of Fuzzy Information, Vol 2, pp
          213-229, 1987.

[Tah77]   Tahani, Valiollah, "A Conceptual Framework for Fuzzy Query Processing - A Step toward
          Very Intelligent Database Systems", Information Processing & Management, Vol 13, pp 289-
          303, 1977.

[Uma83]   Umano, M., "Retrieval from Fuzzy Database by Fuzzy Relational Algebra", Proc. IFAC
          Conf. on Fuzzy Information, Knowledge Representation and Decision Analysis, pp 1-16,
          1983.

[Zad65]   Zadeh, L.A., "Fuzzy Sets", Information and Control, Vol 8, pp 338-353, 1965.

[Zad72]   Zadeh, L.A., "A Fuzzy-Set-Theoretic Interpretation of Linguistic Hedges", Journal of
          Cybernetics, Vol 2, pp. 4-34, 1972.

[Zem84]   Zemankova, M., Kandel, A., "A Fuzzy Relational Database - a Key to Expert Systems",
          Interdisciplinary Systems Research, Verlag TUV Rheinland, 1984.

[Zem85]   Zemankova, M., Kandel, A., "Implementing Imprecision in Information Systems",
          Information Sciences 37(1,2,3), pages 107-141, 1985.





N93-29572 


GENETIC ALGORITHMS IN ADAPTIVE FUZZY CONTROL 



C. Lucas Karr and Tony R. Harper 
U.S. Bureau of Mines, Tuscaloosa Research Center 
P.O. Box L, University of Alabama Campus 
Tuscaloosa, AL 35486-9777 


ABSTRACT 




Researchers at the U.S. Bureau of Mines have developed adaptive process control systems in which genetic
algorithms (GAs) are used to augment fuzzy logic controllers (FLCs). GAs are search algorithms that rapidly locate
near-optimum solutions [...] by [...] procedures of natural genetics.
FLCs are [...] that efficiently manipulate a problem environment by modeling the "rule-of-thumb"
strategy used in human decision making. Together, GAs and FLCs possess the capabilities necessary to produce
[...] and robust adaptive control systems. To perform efficiently, such control systems require a
control element to manipulate the problem environment, an analysis element to recognize changes in the problem
environment, and a learning element to adjust membership functions in response to the changes in the
problem environment. Details of an overall adaptive control system are discussed. A specific computer-simulated
system is used to demonstrate the ideas presented.


INTRODUCTION 

The need for efficient process control has never been more important than it is today because of economic stresses
forced on industry by processes of increased complexity and by intense competition in a world market. No industry
is immune to the cost savings necessary to remain competitive; even traditional industries such as mineral processing
(Kelly [...]), chemical engineering (Fogler, 1986), and wastewater treatment (Gottinger, 1991)
[...] cost-cutting measures. Cost-cutting generally requires the implementation of
emerging techniques [...] changing process dynamics. Such systems prove difficult to control with conventional
[...] tools employed for process control can be unduly complex even for simple systems.

In order to accommodate changing process dynamics yet avoid sluggish response times, adaptive control systems
must alter their [...] according to the current state of the process. Modern technology in the form of
high-speed computers and artificial intelligence (AI) has opened the door for the development of control systems
[...] adaptive control used by humans, and perform more efficiently and with more flexibility
than conventional control systems. Two powerful tools [...] that have emerged from the field of
AI are fuzzy logic (Zadeh, 1973) and genetic algorithms (GAs) (Goldberg, 1989).

The U.S. Bureau of Mines has developed an approach to the design of adaptive control systems, based on GAs and
FLCs, that is effective in problem environments with rapidly changing dynamics. Additionally, the resulting
controllers include a mechanism for handling inadequate feedback about the state or condition of the problem
environment. Such controllers are more suitable than past control systems for recognizing, quantifying, and
adapting to changes in the problem environment.

The adaptive control systems developed at the Bureau of Mines consist of a control element to manipulate the
[...] to recognize changes in the problem environment, and a learning element [...]
is described in this paper. A particular problem environment, a computer-simulated chemical system, serves as a 
forum for presenting the details of an adaptive controller being developed by the Bureau. Preliminary results are 
presented to demonstrate the effectiveness of a GA-based FLC for each of the three individual elements. 


PROBLEM ENVIRONMENT 

In this section, a computer-simulated chemical system is introduced to serve as a forum for presenting the details 
of a stand-alone, comprehensive, adaptive controller being developed at the U.S. Bureau of Mines; emphasis is on
the method not the application. The chemical system consists of a continuous stirred tank reactor in which ammonia 
and formaldehyde are mixed to produce hexamine and water. Since the reaction is exothermic, a heat exchanger 
is included to limit the temperature in the reactor. A schematic of the physical system is shown in Figure 1. 




Figure 1.— A schematic of the hexamine system. 

A model used in this research employs the approach described by Kermode and Stevens (1965).
Specifically, the system is modelled with the following set of equations:

Energy Balance

$$q_A \rho C_p T_{Ai} + q_F \rho C_p T_{Fi} - (q_A + q_F) \rho C_p T_{tank} + r(-\Delta H)V - U A \Delta T_m = V \rho C_p \frac{dT_{tank}}{dt}$$

Mass Balances

$$q_A C_{Ai} - (q_A + q_F) C_A - rV = V \frac{dC_A}{dt}$$

$$q_F C_{Fi} - (q_A + q_F) C_F - 1.5\,rV = V \frac{dC_F}{dt}$$

Heat of reaction

$$-\Delta H = 16610 + 121\,(T_{tank} - 293.2)$$

Rate of reaction

$$r = k C_A C_F^2$$

where $q$ represents the volumetric flow rates (l/s), $C$ is the concentration (moles/l), $r$ is the rate of reaction (moles
of ammonia/l s), $V$ is the volume of the reactor (tank) (l), $T$ is temperature (°K), $\Delta H$ is the heat of reaction (cal/gm
mole), $U$ is the heat transfer coefficient (cal/cm² °K s), $\Delta T_m$ is the mean temperature difference for heat transfer in
the heat exchanger (and is a function of the volumetric flow rate of water through the heat exchanger, $q_w$), and $k$
is the rate of reaction constant (l²/mole² s) given by:

$$k = 1420\,e^{-3090/T_{tank}}$$

and the subscripts (A and F) indicate the ammonia and formaldehyde whereas the subscript i represents material
entering the reactor. The assumptions associated with this model include perfect mixing in the reactor, no heat
losses, all physical properties the same as water, and a third-order, irreversible reaction.
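The reaction terms of the model can be sketched in code as follows. This is an illustration under stated assumptions: the rate law is read as third order overall (first order in ammonia, second order in formaldehyde), the factor 1.5 on the formaldehyde consumption is our reading of the source equation, and the function names are hypothetical. The energy balance is omitted because the heat exchanger term is not specified in enough detail here.

```python
import math

REACTOR_VOLUME = 92.4  # litres, the reactor volume used in the simulation


def rate_constant(t_tank):
    """k = 1420 * exp(-3090 / T_tank), with T_tank in degrees K."""
    return 1420.0 * math.exp(-3090.0 / t_tank)


def reaction_rate(c_a, c_f, t_tank):
    """Assumed third-order rate law: r = k * C_A * C_F**2."""
    return rate_constant(t_tank) * c_a * c_f ** 2


def mass_balance_derivatives(q_a, q_f, c_ai, c_fi, c_a, c_f, t_tank,
                             volume=REACTOR_VOLUME):
    """Return (dC_A/dt, dC_F/dt) for the stirred tank.

    Inflow minus outflow minus consumption, divided by the tank volume.
    The 1.5 factor on formaldehyde is an assumption (see lead-in).
    """
    r = reaction_rate(c_a, c_f, t_tank)
    dca_dt = (q_a * c_ai - (q_a + q_f) * c_a - r * volume) / volume
    dcf_dt = (q_f * c_fi - (q_a + q_f) * c_f - 1.5 * r * volume) / volume
    return dca_dt, dcf_dt


print(rate_constant(315.0))
```

Because the rate depends exponentially on temperature and on the square of one concentration, small changes in feed concentration shift the heat release noticeably, which is the nonlinearity the controller has to cope with.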

A reactor having a volume of 92.4 l was simulated. The inflows of ammonia and formaldehyde, respectively, were
allowed to reach maximum values of 1.885 l/s, while the maximum flow rate of the heat exchanger was 1.2 l/s.
The objective of the control problem is two-fold: (1) to develop a FLC capable of maintaining a desired reactor
temperature in response to changes in the flow rate of formaldehyde and (2) to maximize the production of hexamine
while minimizing the waste in the amount of reactants used. The amount of water produced was deemed
inconsequential to the control strategy. In this research, the desired reactor temperature is 315.0 °K. Furthermore,
a constraint is placed on the amount the valves controlling the inflow of ammonia can be opened or closed during
a given time step. The maximum rate at which the flow of ammonia can be changed is 0.1885 l/s/s. This
constraint is enforced to limit transients in the system.

The system, as it has been described to this point, provides a challenging control problem, due mainly
to the nonlinearity present in the rate of reaction. It is a non-trivial task to maintain the temperature in the reactor
for various forcing functions (as defined by the rate at which the formaldehyde enters the reactor), much less to
ensure the process proceeds efficiently (maximum hexamine production with minimal waste in ammonia and
formaldehyde). However, yet another complication is now introduced: the concentration of the reactants (the
ammonia and the formaldehyde) can be altered randomly. Furthermore, there is no mechanism in place for
providing the controller with feedback concerning the nature of these changes. Thus, an efficient control system
must be able to recognize when the hexamine system has been altered (when the concentrations of the reactants are
changed), it must be able to determine the new values of the concentrations, and it must be able to alter its control
strategy in response to the changes; an adaptive control system is needed.


STRUCTURE OF THE ADAPTIVE CONTROLLER 

Figure 2 shows a schematic of the Bureau's adaptive control system. The heart of this control system is the loop
consisting of the control element and the problem environment. The control element receives information from
sensors in the problem environment concerning the status of the condition variables, i.e., $q_A$, $q_F$, $q_w$, and $T_{tank}$. It
then computes a desirable state for a set of action variables, i.e., flow rate of ammonia ($q_A$) and flow rate of water
through the heat exchanger ($q_w$). These changes in the action variables force the problem environment toward the
setpoint ($T_{tank}$ = 315.0 °K). This is the basic approach adopted for the design of virtually any closed loop control
system, and in and of itself includes no mechanism for adaptive control.


Figure 2.— Structure of the adaptive control system.


The adaptive capabilities of the system shown in Fig. 2 are due to the analysis and learning elements. In general,
the analysis element must recognize when a change in the problem environment has occurred. A change, as the term
is used here, consists of a change to the concentration of either of the reactants. The analysis element uses
information concerning the condition and action variables over some finite time period to recognize changes in the
environment and to compute the new performance characteristics associated with these changes.

The new environment (the problem environment with the altered parameters) can pose many difficulties for the
control element because the control element is no longer manipulating the environment for which it was designed.
Therefore the algorithm that drives the control element must be altered. As shown in the schematic of Fig. 2, this
is accomplished by the learning element. The most efficient approach for the learning element to use to alter
the control element is to utilize information concerning the past performance of the control system. The strategy
used by the control, analysis, and learning elements of the stand-alone, comprehensive adaptive controller being
developed by the U.S. Bureau of Mines is provided in the following sections.


Control Element 

The control element receives feedback from the hexamine system, and based on the current state of $q_A$, $q_F$, $q_w$, and
$T_{tank}$, must prescribe appropriate values of $q_A$ and $q_w$. Any of a number of closed-loop controllers could be used
for this element. However, because of the flexibility needed in the control system as a whole, a FLC is employed.
Like conventional rule-based systems, FLCs use a set of production rules which are of the form:

IF {condition} THEN {action}

to arrive at appropriate control actions. The left-hand-side of the rules (the condition side) consists of combinations
of the controlled variables ($q_A$, $q_F$, $q_w$, and $T_{tank}$); the right-hand-side of the rules (the action side) consists of
combinations of the manipulated variables ($q_A$ and $q_w$). Unlike conventional expert systems, FLCs use rules that
utilize fuzzy terms like those appearing in human rules-of-thumb. For example, a valid rule for a FLC used to
manipulate the hexamine system is:

IF {$q_A$ is VH and $q_F$ is VL and $q_w$ is L and $T_{tank}$ is VH}
THEN {$q_A$ is NB and $q_w$ is PB}.


The fuzzy terms are subjective; they mean different things to different "experts," and can mean different things in
varying situations. Fuzzy terms are assigned concrete meaning via fuzzy membership functions (Zadeh, 1973).


The membership functions used in the control element to describe ammonia flow rate appear in Fig. 3. (As will
be seen shortly, the learning element is capable of changing these membership functions in response to changes in
the problem environment.) These membership functions are used in conjunction with the rule set to prescribe
[...]. Unlike conventional expert systems, FLCs allow for the [...]
of more than one rule at any given time. The single crisp action is [...] using a weighted averaging
technique that incorporates both a min-max operator and the center-of-area method (Karr, 1991). The following
fuzzy terms are used, and therefore "defined" with membership functions, to describe the significant variables in
the hexamine system:


<1f 

q« 

T 

«u 

flA 


Very Low (VL), Low (L), Medium (M), High (H), Very High (VH) 

Low (L), Medium (M), High (H), Very High (VH)

Low (L), Medium (M), High (H) 

Very Low (VL), Low (L), Medium (M), High (H), Very High (VH) 

Negative Big (NB), Negative Medium (NM), Negative Small (NS), Zero (Z), 
Positive Small (PS), Positive Medium (PM), Positive Big (PB) 

Negative Big (NB), Negative Medium (NM), Negative Small (NS), Zero (Z), 
Positive Small (PS), Positive Medium (PM), Positive Big (PB). 



Figure 3.-Fuzzy membership functions for the flow rate of ammonia. 
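The inference scheme described above (several rules firing at once, a min-max operator, and center-of-area defuzzification) can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the two inputs, the trapezoid corner points, the four rules, and the names `trapezoid` and `infer` are hypothetical and are not the values or rule set used for the hexamine system.

```python
def trapezoid(x, a, b, c, d):
    """Grade of x in a trapezoid with feet a, d and shoulders b, c (a < b <= c < d)."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x > c:
        return (d - x) / (d - c)
    return 1.0

# Hypothetical term definitions (not the paper's values).
TEMP = {"L": (-1, 0, 30, 50), "H": (30, 50, 100, 101)}
FLOW = {"L": (-1, 0, 3, 5), "H": (3, 5, 10, 11)}
ACTION = {"NB": (-1.5, -1.0, -0.6, -0.2),
          "Z": (-0.4, -0.1, 0.1, 0.4),
          "PB": (0.2, 0.6, 1.0, 1.5)}

# Each entry reads: IF {temp is t and flow is f} THEN {change is a}
RULES = [("H", "H", "NB"), ("H", "L", "Z"), ("L", "H", "Z"), ("L", "L", "PB")]

def infer(temp, flow, steps=200):
    """Return one crisp action from all rules that fire for (temp, flow)."""
    # min operator: a rule fires to the degree of its weakest condition
    fired = [(min(trapezoid(temp, *TEMP[t]), trapezoid(flow, *FLOW[f])), a)
             for t, f, a in RULES]
    num = den = 0.0
    for i in range(steps + 1):
        y = -1.0 + 2.0 * i / steps
        # max operator: combine the clipped action sets of all fired rules
        mu = max(min(w, trapezoid(y, *ACTION[a])) for w, a in fired)
        num += y * mu
        den += mu
    # center of area of the combined action set
    return num / den if den > 0 else 0.0
```

With these assumed definitions, an input where all four rules fire equally yields an action near zero, while an input that fully satisfies only the first rule yields a clearly negative action.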


An effective FLC for manipulating the hexamine system can be written that contains 300 rules, if the random
changes to the concentrations of the reactants are neglected. The 300 rules are necessary because the four
condition variables are described by five, five, four, and three fuzzy terms
(5*5*4*3 = 300 rules to describe all possible combinations that could exist in the hexamine system). The rules of
this control element are certainly inadequate to control the full-scale hexamine system; the one that includes the
changing concentrations. However, the performance of a FLC can be dramatically altered by changing the membership
functions. This is equivalent to changing the definition of the terms used to describe the variables being considered
by the controller. As will be seen shortly, GAs are powerful tools capable of rapidly locating efficient fuzzy
membership functions that allow the controller to accommodate changes in the concentrations of the reactants.
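The rule count can be checked by enumerating every combination of condition terms. This is only a sketch: the term lists follow the five-, four-, and three-term sets listed earlier, and no claim is made here about which controlled variable carries which set.

```python
from itertools import product

TERMS_5 = ["VL", "L", "M", "H", "VH"]
TERMS_4 = ["L", "M", "H", "VH"]
TERMS_3 = ["L", "M", "H"]

# One tuple per rule condition; the assignment of term sets to variables is an assumption.
conditions = list(product(TERMS_5, TERMS_5, TERMS_4, TERMS_3))
```

Each tuple in `conditions` is the left-hand side of one rule, so a complete rule set needs one action per tuple.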


Analysis Element 

The analysis element recognizes changes in parameters associated with the problem environment not taken into
account by the rules used in the control element. In the hexamine system, these parameters are the concentrations
of the two reactants. Changes to the concentrations dramatically alter the way in which the hexamine system
responds to control actions, thus forming a new problem environment requiring an altered control strategy. Recall
that the FLC used for the control element presented includes none of these parameters in its 300 rules. Therefore,
some mechanism for altering the prescribed actions must be included in the control system. But before the control
element can be altered, the control system must recognize that the problem environment has changed, and compute
the nature and magnitude of the changes.

The analysis element recognizes changes in the system parameters by comparing the response of the system being
controlled to the response of a model of the hexamine system. In general, recognizing changes in the parameters
associated with the problem environment requires the control system to store information concerning the past
performance of the problem environment. This information is most effectively acquired through either a data base
or a computer model. Storing such an extensive data base can be cumbersome and requires extensive computer
memory. Fortunately, the dynamics of the hexamine system are well understood. In the approach adopted here,
a computer model predicts the response of the hexamine system being controlled. This predicted response is
compared to the response of the system being controlled. When the two responses differ by a threshold amount over
a finite period of time, the hexamine system is considered to have been altered.
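The comparison of predicted and measured responses can be sketched as follows. The function name `environment_changed` and the reading of "a finite period of time" as a run of consecutive samples are assumptions made for this illustration; the text does not give the threshold or the length of the period.

```python
def environment_changed(measured, predicted, threshold, window):
    """Return the sample index at which the two responses have differed by more
    than `threshold` for `window` consecutive samples, or None if they never do."""
    run = 0
    for i, (m, p) in enumerate(zip(measured, predicted)):
        # count consecutive samples whose mismatch exceeds the threshold
        run = run + 1 if abs(m - p) > threshold else 0
        if run >= window:
            return i
    return None
```

Requiring a run of samples rather than a single excursion keeps one noisy measurement from triggering the parameter search.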

When the above approach is adopted, the problem of computing the new system parameters becomes a curve fitting
problem (Karr, Stanley, and Scheiner, 1991). The parameters associated with the computer model produce a
particular response to changes in the action variables. The parameters must be selected so that the response of the
model matches the response of the problem environment.

An analysis element has been forged in which a GA is used to compute the values of the parameters associated with
the system. When employing a GA in a search problem, there are basically two decisions that must be
made: (1) how to code the parameters as bit strings and (2) how to evaluate the merit of each string (the fitness
function must be defined). The GA used in the analysis element employs concatenated, mapped, unsigned binary
coding (Karr and Gentry, 1992). The bit-strings produced by this coding strategy were of length 16: the first 8
bits of the strings were used to represent the concentration of the ammonia and the second 8 bits were used to
represent the concentration of the formaldehyde. The 8 bits associated with each individual parameter were read
as a binary number, converted to decimal numbers (000 = 0, 001 = 1, 010 = 2, 011 = 3, etc.), and mapped
between minimum and maximum values according to the following:


$$
C = C_{min} + \frac{b}{2^{m} - 1}\,(C_{max} - C_{min}) \tag{7}
$$


where C is the value of the parameter in question, b is the binary value, m is the number of bits used to represent
the parameter (8), and C_min and C_max are minimum and maximum values associated with each parameter
that is being coded.
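A minimal sketch of this decoding step is given below. The function names `decode` and `decode_chromosome` and the example bounds are hypothetical; only the mapping itself (binary value scaled by 2^m - 1 between a minimum and a maximum) and the layout of two concatenated 8-bit fields come from the text.

```python
def decode(bits, c_min, c_max):
    """Map one bit field onto [c_min, c_max]: C = C_min + b / (2**m - 1) * (C_max - C_min)."""
    m = len(bits)          # number of bits used for this parameter
    b = int(bits, 2)       # the field read as an unsigned binary number
    return c_min + b / (2 ** m - 1) * (c_max - c_min)

def decode_chromosome(chromosome, bounds, bits_per_param=8):
    """Split a concatenated string into equal fields and decode each against its bounds."""
    fields = [chromosome[i:i + bits_per_param]
              for i in range(0, len(chromosome), bits_per_param)]
    return [decode(f, lo, hi) for f, (lo, hi) in zip(fields, bounds)]
```

For a 16-bit string, the first returned value would be the ammonia concentration and the second the formaldehyde concentration, each within its own assumed bounds.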

A fitness function has been employed that represents the quality of each bit-string; it provides a quantitative
evaluation of how accurately the response of a model using the new model parameters matches the response of the
system being controlled. The fitness function used in this application is:





l-500t 

/- Ef-U-W 


(*) 


With this definition of the fitness function, the problem becomes a minimization problem: the GA must minimize
f, which, as it has been defined, represents the difference between the response predicted by the model and the
response of the system being controlled.



Figure 4. -A GA is able to compute the concentrations of the reactants. 


Figure 4 demonstrates the ability of a GA to select the appropriate parameters associated with the problem
environment. A GA is able to reduce the difference between the response of the hexamine system being controlled
and the response predicted by the model after a number of function evaluations. Once the new parameters (and
thus the new response characteristics of the problem environment) have been
determined, the adaptive element must alter the control element.


Learning Element

The learning element alters the control element in response to changes in the problem environment. It does so by
altering the membership functions employed by the FLC of the control element. Since none of the randomly altered
parameters appear in the FLC rule set, the only way to account for these conditions (outside of completely
revamping the system) is to alter the membership functions employed by the FLC. These alterations consist of
changing both the position and location of the trapezoids used to define the fuzzy terms.

Altering the membership functions (the definition of the fuzzy terms in the rule set) is consistent with the way
humans control complex systems. Quite often, the rules-of-thumb humans use to manipulate a problem environment
remain the same despite even dramatic changes to that environment; only the conditions under which the rules are
applied are altered. This is basically the approach that is being taken when the fuzzy membership functions are
altered.

The U.S. Bureau of Mines uses a GA to alter the membership functions associated with FLCs, and this technique
has been well documented (Karr, 1991). A learning element that utilizes a GA to locate high-efficiency membership
functions for the dynamic hexamine system has been designed and implemented.






The performance of a control system that uses a GA to alter the membership functions of its control element is
demonstrated for the situation in which the concentrations of both reactants are altered. Figure 5 compares the
performance of the adaptive control system (one that changes its membership functions in response to changes in
the system parameters) to a non-adaptive control system (one that ignores the changes in the system parameters).
In this figure, the concentrations of both reactants have been altered 510 seconds into the simulation. In this case,
not only is the adaptive controller able to better maintain the desired tank temperature, but it also prescribes control
actions that allow for the production of more hexamine.



Figure 5.— The adaptive controller is much more efficient. 


SUMMARY 

Scientists at the U.S. Bureau of Mines have developed an AI-based strategy for adaptive process control. This
strategy uses GAs to fashion three components necessary for a robust, comprehensive adaptive process control
system: (1) a control element to manipulate the problem environment, (2) an analysis element to recognize changes
in the problem environment, and (3) a learning element to adjust to changes in the problem environment. The
application of this strategy to a computer-simulated hexamine system has been described.


REFERENCES 

Fogler, H. S. (1986). Elements of Chemical Reaction Engineering. Prentice-Hall, Englewood Cliffs, NJ.

Goldberg, D. E. (1989). Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley,
Reading, MA.

Gottinger, W. W. (1991). Economic Models and Applications of Solid Waste Management. Gordon and Breach
Science Publishers, New York, NY.

Karr, C. L. (1991). Genetic algorithms for fuzzy logic controllers. AI Expert, 6, 26-33.

Karr, C. L., Stanley, D. A., and Scheiner, B. J. (1991). A Genetic Algorithm Applied to Least Squares Curve
Fitting. U.S. Bureau of Mines Report of Investigations No. 9339.

Kelly, E. G. and Spottiswood, D. J. (1982). Introduction to Mineral Processing. John Wiley & Sons, New York,
NY.

Kermode, R. I., and Stevens, W. F. (1965). Experimental verification of the mathematical model for a continuous
stirred-tank reactor. The Canadian Journal of Chemical Engineering, April, 68-72.

Zadeh, L. A. (1973). Outline of a new approach to the analysis of complex systems and decision processes. IEEE
Transactions on Systems, Man, and Cybernetics, SMC-3, 28-44.






N93-29573 

A Genetic Algorithms Approach for Altering the Membership 
Functions in Fuzzy Logic Controllers 


Hana Shehadeh 
LinCom Corporation 
Houston, TX 77058 
(713)488-5700 


Robert N. Lea 

NASA/Johnson Space Center - FM7 
Houston, TX 77058 
(713) 483-8015 


Abstract 

1 W a t^y c««»l ^ 

3 ? - 

offers a technique for tuning fuzzy logic controllers.

, . . n ,u.hased system that uses fuzzy linguistic, variables to model human rule-of- 

£ features rules that direct 

values used for system control [7]. 

Defining the fuzzy membership functions is the most time consuming aspect of the controller design. One
single change in the membership functions could significantly alter the performance of the controller. The
membership functions are usually altered by hand, through trial and error, until a highly tuned controller is
obtained. This approach can be time consuming and requires a great deal of
knowledge from human experts.

In order to shorten development time, an iterative procedure for altering the membership functions was created.
Genetic algorithms, a search and optimization technique, was the method utilized to solve this
problem.

1. PREVIOUS WORK 

«nr nimmated rendezvous ooerations for future space missions. 


1.1 SCENARIO 

C0ns ^ nt mso^^ihc stationary target The velocity vector and radial vector approaches 

posifio&s o rmtm i ak»ff with m aintai ning the elevation and azimuth angles at zero. The station-keeping 

«*»“? range Srernte near zero, and dm azimuth and elevation angles 

vector to the negative radial vector tetpiirwmamtanmg 
^ST’nSTe e y ie^S!S^in.»«h angles. In figure ID. the elevaion and azimuth angles are being 
maintained at zero. 





. • <nnrfw>rh , * >g Annie approaches from 400 fee t to 50 feet the 

F.gure IB shows the velocity ^ fl , Jd 0 f view of tbc Crew Opuczl 

target on tbe velocity vector, , » Rnue 1C shows the radial vector approach, where 

Alignment Sighting 55?^ ^^telowttr^ target® 50 feet below the target on the radial vector. Here again. 

Stott «d»l ™ * P«*oo »* — «-« 

the desired radial vector position. 



2. CURRENT WORK 

■*- ~ i *- * *• 

translational hand controller [3]. 

increase in the payload capacity results. 

w. ■*.. - *.■ 'rZZSSSl SS^SS^-*5-£S? 

su^egy. Defining itie significantly allcrcd ihe performance of ihc controller This 





the membership functions in order to minimize fuel consumption. Figure 2 shows the final range parameters
membership functions from the manually-tuned controller after extensive fine-tuning.





Figure 2: Range Parameters Membership Functions Set

Membership Functions Constraints

A chromosome string which consists of 38 points defined the range and range rate membership functions. The
fitness function regulated the membership functions points to float along the universe of discourse within
certain constraints. Figure 3 shows the labeled points. An example of the constraints algorithm placed on the
individual points is as follows: The positive large (PL) vertex for range, labeled 3, is constrained to a value
between the vertex of the positive medium membership function and the maximum value of the universe of
discourse. The positive medium (PM) vertex for range, labeled 2, is constrained to a value between the vertex of
the positive small (PS) membership function and the maximum value of the universe of discourse. The points
labeled 4, 5, and 6 also follow this algorithm. The right leg of the positive medium membership function,
labeled 13, is constrained to a value between the vertex of the positive medium membership function and the
minimum value of the universe of discourse. The leg of the positive large membership function, labeled 12, is
constrained to a value between the vertex of the zero membership function and the vertex of the positive large
membership function. The points labeled 9, 10, 14, 15, 16, 17, and 18 also follow this algorithm. The
points labeled 0, 7, 8 are fixed at zero and are not allowed to float. The membership functions for the range rate
parameters are symmetric and follow the same algorithm for this approach.
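The vertex constraints described above amount to clamping each proposed point between a lower neighbor and the edge of the universe of discourse. The sketch below covers only the PS, PM, and PL vertices; the names `clamp` and `constrain_vertices`, the dictionary layout, and the lower bound used for PS (the fixed zero vertex) are assumptions for illustration and are not taken from the paper.

```python
def clamp(value, low, high):
    """Force value into the closed interval [low, high]."""
    return max(low, min(high, value))

def constrain_vertices(proposed, universe_max):
    """Apply ordering constraints to proposed PS, PM, PL vertex positions.

    PM is held between the PS vertex and the universe maximum, and PL between
    the PM vertex and the universe maximum, as in the text. The PS bound and
    the fixed zero vertex are assumed here.
    """
    ps = clamp(proposed["PS"], 0.0, universe_max)
    pm = clamp(proposed["PM"], ps, universe_max)
    pl = clamp(proposed["PL"], pm, universe_max)
    return {"Z": 0.0, "PS": ps, "PM": pm, "PL": pl}
```

Applying the clamps in order from the innermost vertex outward guarantees that the decoded membership functions never cross, whatever bit string the genetic algorithm proposes.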





[Figure 3 panel: Range Rate, with labeled points 26 through 37]

Figure 3: Selected Points


rarj 

mcmbe ^^ 

S^Tto fudronsumption ). Each of these chromosomes which is 

function parameters. Each chromosome is sent to the Orbital Operations Simulator (OOS)
[6] (discussed in greater detail in this paper), where simulated runs on the velocity vector approach and
station-keeping are performed. The bit strings which represent the parameters of the
membership functions are assigned a score (a fitness-function value), that is, a non-negative measure of relative
worth representing the degree to which they accomplish the goal of defining the high-performance fuzzy controller.
The fuel used and the time required, for both the approach maneuver and station keeping, were used to assign this
score (Eq. 1). The score is used in the determination of a
probability for selection during the reproduction phase.


$$
Fitness = \frac{1}{1 + (ApproachFuel \times ApproachTime) + (StationKeepFuel \times StationKeepTime)} \tag{1}
$$

Those parameters that were given the higher probability were placed in the new generation. A crossover and
mutation operation was then applied to the new chromosomes and the process was re-iterated until a good solution
was found.
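The scoring step can be sketched as below. The reciprocal form of the fitness is an assumption: the text only states that the score is non-negative, that it is built from fuel and time for the approach and station-keeping phases, and that a higher score gives a higher selection probability. The names `fitness` and `selection_probabilities` are hypothetical, and proportional selection is used here purely to show how scores become probabilities.

```python
def fitness(approach_fuel, approach_time, station_keep_fuel, station_keep_time):
    """Assumed form: a score in (0, 1] that grows as fuel-time cost shrinks."""
    cost = approach_fuel * approach_time + station_keep_fuel * station_keep_time
    return 1.0 / (1.0 + cost)

def selection_probabilities(scores):
    """Turn non-negative scores into selection probabilities that sum to one."""
    total = sum(scores)
    return [s / total for s in scores]
```

Under this assumed form, a chromosome whose simulated runs burn less fuel always receives a larger share of the selection probability.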




Total Parameters = 38    String Length = 190    Population Size = 50



Figure 4: Flow Chart of the Selection Process

3. SPECIALIZED SOFTWARE TOOLS USED


Splicer 

Splicer [1,2], a genetic algorithm tool designed for developing code for evaluating chromosomes, was used in
this project. The objectives of this approach were to evaluate the capabilities of genetic algorithms for the
widespread use in automating the fine tuning of the fuzzy logic membership functions.

If successful, this type of approach would be applicable in a variety of domains, e.g., robot path planning and
job shop scheduling. Splicer is a flexible, generic tool that allows for

• Implementing the basic genetic algorithms defined in the literature 

• Defining the interfaces for and allowing users to develop 
interchangeable fitness modules 

• Providing a graphic, event-driven user interface. 


Splicer consists of a genetic algorithm kernel that comprises all functions necessary for the manipulation of
populations, including the creation of populations and the population members, fitness scaling, and random
number generation. It also provides representation libraries for binary strings and for permutations. The fitness
modules are the only component of the Splicer system a user will be required to create or alter to solve a
particular problem. Within a fitness routine a user can create a fitness (scoring) function and set the initial
values for the control parameters. Splicer is available in X-Windows and Macintosh versions, as well as a
generic C language command line version [1,2].

Orbital Operations Simulator (OOS) 

For testing the 6-DOF controller, NASA's OOS was used with its graphics interface to the Iris workstation.
The OOS is a high fidelity, multi-vehicle spacecraft operations simulation that provides 6-DOF equations of 
motion within an orbital environment including gravity gradient and aerodynamic drag. 

The OOS has a high fidelity Space Shuttle model with the fuzzy 6-DOF controller and the required orbital 
environment math models. The OOS also has the capability of simulating mission timelines according to crew 














procedures. The 6-DOF fuzzy logic shuttle controller was implemented in the OOS environment and detailed 
simulation testing was used to evaluate its performance. 

4. RESULTS 

Table 1 shows the results of the two designs. A comparison was made between the outputs from the piloted
simulation runs, the manually tuned controller outputs and the automatically fine-tuned controller outputs.

As can be seen, the performance of the fuzzy controller compares quite well to the piloted results. The manually
tuned fuzzy controller's fuel usage differed from the piloted results for the approach
maneuver by .001 lbs/sec, and for station keep by .0133 lbs/sec.

Membership functions automatically tuned by the genetic algorithm produced results
comparable with the manually tuned fuzzy controller. The Fuzzy Genetic Algorithm controller used .002 lbs/sec
more fuel than the manually tuned fuzzy controller but used .001 lbs/sec less than the fuel required for the piloted
approach maneuver. For station keep, the Fuzzy Genetic Algorithm controller used .004 lbs/sec
less fuel than the piloted results. However, the Fuzzy Genetic Algorithm used .0093 lbs/sec more fuel than the
manually tuned fuzzy controller.

The results from the genetic algorithm tuning approach are promising. It is anticipated that through future
work improved performance of the controller can be achieved by allowing the height of the vertex points to
float as well as the positions in the domain.


Piloted Results
    vbar approach    400-50    .022 (lbs/sec)
    Station Keep     @200      .02 (lbs/sec)

Manually Tuned Fuzzy Controller
    vbar approach    400-50    .023 (lbs/sec)
    Station Keep     @200      .0067 (lbs/sec)

Automatically Tuned Fuzzy Controller
    vbar approach    400-50    .021 (lbs/sec)
    Station Keep     @200      .016 (lbs/sec)

Table 1: Results Summary Table

The graph in Figure 5 shows the relationship between the amount of fuel used versus the number of generations.
M (St ^v^im tSd by dm algorithm. The concern h» w» that the genetic al^nthm 
ZSSSi Results that compared quite well to the manual were achieved, yet convergence to a mmtmal 
amount of fuel was not achieved. 









5. ISSUES/CONCERNS 

The problem domain parameters defined for this approach are: Sampling Operator (TOURNAMENT),
Population Size (50), and Mutation Probability (0.001).

A tournament type sampling operator is used to sample members of the population for mating. Sampling uses
target sampling rates (generated by selection) to create a mating pool of members from the current population.
A member can be chosen for mating multiple times or not at all, according to its target sampling rate.
The mutation probability operator was set to .001 [1,2].
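A generic tournament sampling step and a bit-flip mutation can be sketched as follows. This is a textbook-style illustration and not Splicer's implementation, whose internals are not described here; the function names, the default tournament size of two, and the use of Python's `random.Random` are all assumptions.

```python
import random

def tournament_select(population, fitnesses, rng, tournament_size=2):
    """Draw `tournament_size` distinct members at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), tournament_size)
    best = max(contenders, key=lambda i: fitnesses[i])
    return population[best]

def mutate(bits, rate, rng):
    """Flip each bit of a bit string independently with probability `rate`."""
    flipped = {"0": "1", "1": "0"}
    return "".join(flipped[b] if rng.random() < rate else b for b in bits)

def mating_pool(population, fitnesses, rng, tournament_size=2):
    """Fill a pool the size of the population; a member may appear several times or never."""
    return [tournament_select(population, fitnesses, rng, tournament_size)
            for _ in population]
```

With a mutation rate of 0.001 and a 190-bit string, a flip occurs in roughly one chromosome out of five on average, so most of the variation between generations comes from selection and crossover.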

These genetic operators were applied to the members of the population, and their strings, while the genetic 
algorithms were running. 

As a feasibility study, the population size of fifty was chosen to fine-tune the membership functions.
Considering the size of the chromosome string, a larger population may have been warranted. However, the
complexity of the evaluation process resulted in an evaluation time of several minutes per population member.
Given this evaluation time, a population size greater than fifty was prohibitive. Now that feasibility has been
demonstrated, larger populations can be considered in future test cases.

For the test cases performed in this paper, the orbiter's starting position was at 400 feet from the target on the
velocity vector. It is possible that the test cases used for evaluation may have exercised only a portion of the control
rules. Test cases where the starting position of the orbiter is randomly initialized would have given a
more accurate evaluation of the genetic algorithm's effectiveness.

Finally, since the orbiter's starting position was always 400 feet from the target on the velocity vector, only three
out of the seven membership functions were exercised; the
membership functions NL, NM, PM, and PL were never used. Having random starting positions of the orbiter
would then require the use of all seven of the membership functions.

It is interesting to note that all three approaches (piloted, manually tuned, and fuzzy genetic algorithm)
produced similar results for the velocity vector approach (~0.02 lbs/sec). A possible reason for this is that the
controller is also controlled with braking gates. When approaching a target, the orbiter is subject to a
speed limit which is a function of the distance to the target. These range and range-rate limits are called braking
gate set points and are shown in Figure 7. Outside of 400 feet the approach speed is 0.4 ft/sec. At 300 feet the
allowable approach rate drops to 0.3 ft/sec. The 0.2 ft/sec rate is maintained from 200 feet to approach
termination. Since the approach rates are fixed by the braking gates, the fuel usage of the three approaches is
similar.





6. SUMMARY 


We have demonstrated the use of genetic algorithms to automate the fine tuning of fuzzy logic membership
functions for a spacecraft proximity operations controller. The complexity of the problem and
the time required for the genetic algorithm population member evaluations did place some constraints on
our investigation of the problem. However, a solution comparable to a highly
tuned manual controller was obtained in a reasonable amount of time. Genetic algorithms show a viable potential
for the automatic fine-tuning of fuzzy logic based control systems.

Acknowledgments 

The authors would like to thank Steven Bayer of MITRE Corporation, Jeff Hoblit of LinCom Corporation,
5SS2? Jaititf ' Topi. JmnesVillarreal andLui Wang of National Aeronaut and Space Administration - 
Lyndon B. Johnson Space Center/PT4 for their technical support and analysis help for this project.

References 

1. Bayer, Steven E.: SPLICER - A Genetic Algorithm Tool for Search and Optimization, User's Manual.
   A Product of the Software Technology Branch, Lyndon B. Johnson Space Center, v.1, 1991.

2. Bayer, Steven E.: SPLICER - A Genetic Algorithm Tool for Search and Optimization, Reference
   Manual. A Product of the Software Technology Branch, Lyndon B. Johnson Space Center, v.1, 1991.

3. Lea, Robert N.: Automated Space Vehicle Control for Rendezvous Proximity Operations. Telematics
   and Informatics, vol. 5, no. 3, 1988.

4. Lea, Robert N.; Jani, Yashvant; Hoblit, Jeff: A Fuzzy Logic Based Spacecraft Controller for Six
   Degree of Freedom Control and Performance Results. Proceedings of AIAA '91.

5. Rawlins, Gregory J.E.: Foundations of Genetic Algorithms. Morgan Kaufmann Publishers, Inc., 1991.

6. Edwards, H.C.; and Bailey, R.: The Orbital Operations Simulator User's Guide. LinCom Corporation,
   ref. LM85-1001-01, June 87.

7. Karr, C.: Genetic Algorithms for Fuzzy Controllers. AI Expert, March 1991.







Fuzzy Multiple Linear Regression - A Computational Approach 


by 

C. H. Juang*, X. H. Huang, and J. W. Fleming
Department of Civil Engineering 
Clemson University 
Clemson, SC 29634-0911 

Telephone number: (803) 656-3322 
Fax number: (803) 656-2670 





A paper prepared for presentation and publication 

in 

The NAFIPS ’92 
Puerto Vallarta, Mexico 
December 15-17, 1992 


*The presenter and to whom correspondence should be addressed.




INTRODUCTION 


Conventional regression analysis is a statistical tool for describing relationships between variables.
If a large and representative data set is available, a "good" relation might be established using an
appropriate model. If the statistical properties such as the coefficient of determination (R^2) meet certain
criteria of "good" fitting, the relation obtained from the regression analysis may then be used for "making
predictions." The regression technique is, indeed, a very useful tool for solving many engineering
problems. However, there are situations where use of the conventional regression analysis is not feasible.
For example, when data are imprecise, as is usually the case in many geotechnical engineering problems
such as predicting the conductivity of clay liner, the conventional regression analysis is not applicable
(Bardossy, et al., 1987, 1989). Another example concerns rules of thumb often used in engineering
practice. These rules of thumb are, in a loose sense, relationships between linguistic variables.


Fuzzy regression was perhaps first introduced by Tanaka et al. (1982). Fuzzy regression analysis,
as the name implies, uses the tools of fuzzy set theory to analyze fuzzy variables. Bardossy et al. (1987)
extended the fuzzy linear regression method of Tanaka et al. (1982) to nonlinear cases. In contrast to the
statistical least-squares criterion, a fuzzy criterion based on a "vagueness" measure for the goodness of
the regression was used in their approach. While this approach has been applied to solving many
engineering problems, some questions remain to be answered. Among them are questions regarding
uniqueness of the fitting, selection of the vagueness criteria, and the interpretation of fuzzy regression.


This paper presents a new computational approach for performing fuzzy regression. In contrast
to Bardossy's approach (1989), the new approach, while dealing with fuzzy variables, follows closely the
conventional regression technique. In this approach, treatment of fuzzy input is more "computational"
than "symbolic." The following sections first outline the formulation of the new approach, then detail
the implementation and computational scheme, followed by examples to illustrate the new procedure.

FUZZY MULTIPLE LINEAR REGRESSION - A FRAMEWORK 



Suppose that a set of a limited number of observations, (y, x_1, ..., x_m)'s, is to be used to
determine a relationship. If all variables are non-fuzzy, the conventional multiple linear regression
involves fitting to the given data the following equation:

$$
y = a_0 + a_1 x_1 + \cdots + a_m x_m \tag{1}
$$

where a_0, a_1, ..., a_m are the coefficients that minimize the sum of the squares of the residuals. These
coefficients may be determined by solving the following equation:


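$$
\begin{bmatrix}
n & \sum x_{1i} & \cdots & \sum x_{mi} \\
\sum x_{1i} & \sum x_{1i}^{2} & \cdots & \sum x_{1i} x_{mi} \\
\vdots & \vdots & & \vdots \\
\sum x_{mi} & \sum x_{mi} x_{1i} & \cdots & \sum x_{mi}^{2}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix}
=
\begin{bmatrix} \sum y_i \\ \sum x_{1i} y_i \\ \vdots \\ \sum x_{mi} y_i \end{bmatrix}
\tag{2}
$$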










The coefficient of determination (R^2), a handy measure of goodness-of-fit (but not an absolute indicator),
is defined as follows:

$$
R^{2} = (S_t - S_r) / S_t \tag{3}
$$

where

$$
S_t = \sum \left[ y_i - \left( \sum y_i \right) / n \right]^{2} \tag{4}
$$

and

$$
S_r = \sum \left[ y_i - (a_0 + a_1 x_{1i} + \cdots + a_m x_{mi}) \right]^{2} \tag{5}
$$


In the above equations, all the summations are performed for i from 1 to n. Equations 1 through 5 define
the conventional linear regression based on the least-squares criterion. These equations operate on non-fuzzy
data. As such, interpretation of results of a regression analysis is straightforward.
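Equations 1 through 5 can be turned into a small numerical routine. The sketch below assembles the normal equations of Eq. 2, solves them by Gaussian elimination, and evaluates R^2 from Eqs. 3 through 5. The function names `solve` and `fit` and the data layout (one list of predictor values per observation) are choices made for this illustration only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def fit(xs, ys):
    """Return ([a_0, ..., a_m], R^2) for observations xs (lists of m predictors) and ys."""
    rows = [[1.0] + list(x) for x in xs]  # leading 1 carries the intercept a_0
    k = len(rows[0])
    # Normal equations (Eq. 2): sums of cross products on the left, sums with y on the right
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    a = solve(A, b)
    mean_y = sum(ys) / len(ys)
    s_t = sum((y - mean_y) ** 2 for y in ys)                       # Eq. 4
    s_r = sum((y - sum(ai * ri for ai, ri in zip(a, r))) ** 2      # Eq. 5
              for r, y in zip(rows, ys))
    return a, (s_t - s_r) / s_t                                    # Eq. 3
```

For data generated exactly from a linear relation, the routine recovers the generating coefficients and an R^2 of one to within rounding.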

Now, suppose all the given data are fuzzy numbers. In order to follow the above least-squares
approach, new mathematical operations must be defined for processing these fuzzy numbers. Although
fuzzy arithmetic (Kaufmann and Gupta, 1985) such as addition, subtraction, multiplication and division
of fuzzy numbers along with many other operations have been introduced, the efforts required to directly
implement the above regression analysis by fuzzy arithmetic would be overwhelming. It appears that
a simpler approach is warranted.

In the present study, the JHE method (Juang, et al., 1991) is adopted to create a new procedure
for performing regression analysis of fuzzy data. In the JHE method, fuzzy numbers are often
characterized by the beta-M membership function, f(z), defined below (after Juang, et al., 1992):



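$$
f(z) = C\,(z-b)^{\alpha}\,(d-z)^{\beta}, \tag{6}
$$

where

$$
C = \left\{ \alpha^{\alpha} \beta^{\beta} \left[ (d-b)/(\alpha+\beta) \right]^{\alpha+\beta} \right\}^{-1}, \tag{7}
$$

$$
\alpha = p^{2}(1-p)/q^{2} - (1+p), \tag{8}
$$

and

$$
\beta = (\alpha+1)/p - (\alpha+2), \tag{9}
$$

and where

$$
p = (\mu-b)/(d-b), \tag{10}
$$

and

$$
q = \sigma/(d-b). \tag{11}
$$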


Notice that the parameters b, d, μ, and σ in the above equations are the minimum, maximum, mean, and
standard deviation of the variable z. The parameters α and β are positive real numbers. The beta-M
function is essentially a beta probability density function normalized with respect to its maximum
functional value such that its maximum functional value at the mode is 1.0. It is a bounded function and
satisfies the conditions for a fuzzy number (i.e., normal and convex fuzzy subset). The beta-M function
can be symmetric, skewed to right, or skewed to left in shape, and is suitable for representing various
engineering parameters with ambiguity.
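As an illustration of Eqs. 6 through 11, the sketch below builds a beta-M membership function from the four parameters. The function name `beta_m` and the example parameter values are hypothetical; the code assumes the parameters lead to positive α and β, as the text requires, and does not check for it.

```python
def beta_m(b, d, mu, sigma):
    """Return (f, alpha, beta), where f(z) is the beta-M membership function.

    b, d, mu, sigma are the minimum, maximum, mean, and standard deviation.
    Assumes the resulting alpha and beta are positive.
    """
    p = (mu - b) / (d - b)                                  # Eq. 10
    q = sigma / (d - b)                                     # Eq. 11
    alpha = p * p * (1.0 - p) / (q * q) - (1.0 + p)         # Eq. 8
    beta = (alpha + 1.0) / p - (alpha + 2.0)                # Eq. 9
    c = 1.0 / (alpha ** alpha * beta ** beta
               * ((d - b) / (alpha + beta)) ** (alpha + beta))  # Eq. 7

    def f(z):
        if z <= b or z >= d:
            return 0.0                                      # bounded support
        return c * (z - b) ** alpha * (d - z) ** beta       # Eq. 6

    return f, alpha, beta
```

Setting the derivative of Eq. 6 to zero places the mode at b + (d - b) α / (α + β), and the constant C scales the function so that its value there is 1.0.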

The regression analysis involving equations 1 through 5 is basically a deterministic model. In
a deterministic model, if the input is fuzzy numbers, the output will also be fuzzy numbers. For the
problem at hand, the coefficients a_0, a_1, ..., a_m, and R^2 obtained from regression analysis will be fuzzy
numbers. Thus, the predicted value, y, obtained from Eq. 1 for a given x-vector (x_1, x_2, ..., x_m) will be
a fuzzy number. Since the data are imprecise, the "goodness-of-fit" may be measured by some "fuzzy
coefficient of determination," in a manner analogous to the use of the coefficient of determination in the
conventional regression analysis.

In this study, the approach for performing regression analysis of fuzzy data is illustrated in Figure
1. In this approach, fuzzy input data are first "de-fuzzified" before being processed by regression equations (Eqs.
1 through 5). The Monte Carlo simulation technique is used to select a non-fuzzy, random value for a
fuzzy variable based on its membership function. Having de-fuzzified fuzzy numbers into non-fuzzy
values, a set of coefficients including a_0, a_1, ..., a_m, and R^2 can be determined by the conventional
regression analysis. After a large number of sets of the coefficients are obtained, fuzzy numbers
representing these coefficients are "re-constructed." The procedure to implement this approach
is presented below.

PROCEDURE FOR FUZZY MULTIPLE LINEAR REGRESSION 

m „ h . 1 ^!P/ OPOS S Proeedui e for performing a fuzzy multiple linear regression is based on the JHE 
method. This procedure is detailed in five steps as follows: 

eaCh i" 1 ™!. fi ‘ ZZ)r ■ <la,a (membwshi P faction), determine its cumulative faction by 

? e mmm r taai °» al cumulative Inactions in this step 

Repeat this step for all input fuzzy variables. ^ 

.3*“^ Begin i the simulation by generating a uniform random number. Then normalize the 
generated ramdom number with respect to the maximum functional value of the corresponding cumulative 
functions obtained in step #1, followed by equating the normalized random value to the cumulative 
function, anon-fuzzy value for each input membership function can be back-calculated. This step de- 
tuzzifies all input fuzzy data into non-fuzzy data. ^ 


Sl§El. Perform the conventional multiple linear regression described in Eqs. 1 through 5 This 
compmadon'" * ** ° f COeffidentS ’ inc,uding a »> a - -• R 2 - This complete one iteration of the 

Step 4. Repeat Steps 2 and 3 a large number of times. The number of repetitions or simulations
needed for a satisfactory result may be estimated by a trial-and-error procedure.

Step 5. Determine the minimum, maximum, mean, and standard deviation of each of the
regression coefficients based on the values obtained from Steps 3 and 4. For each of these coefficients,
the four parameters (b, d, α, and β) are used to define the beta-M membership function (Eqs. 6 through 11).
The resulting membership functions define the wanted fuzzy numbers that represent the regression
coefficients.
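The five steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the FMLR program itself: it assumes triangular membership functions given as (b, m, d) instead of the beta-M functions used in the paper, and the names `sample_triangular`, `ols`, and `fuzzy_regression` are hypothetical. The sketch stops at the summary statistics of Step 5 and does not fit the beta-M parameters α and β.

```python
import math
import random

def sample_triangular(b, m, d, u):
    """Steps 1-2: invert the normalized cumulative function of a triangular
    membership function (support [b, d], mode m) at the level u in [0, 1]."""
    if d == b:
        return b
    split = (m - b) / (d - b)  # normalized cumulative value at the mode
    if u <= split:
        return b + math.sqrt(u * (d - b) * (m - b))
    return d - math.sqrt((1.0 - u) * (d - b) * (d - m))

def solve(a, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(xs, ys):
    """Step 3: least-squares fit of y = a0 + a1*x1 + ...; returns (coef, R^2)."""
    rows = [[1.0] + list(x) for x in xs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    coef = solve(xtx, xty)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
                 for r, y in zip(rows, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return coef, 1.0 - ss_res / ss_tot

def fuzzy_regression(fuzzy_xs, fuzzy_ys, n_sim=1000, seed=0):
    """Steps 2-5: repeat de-fuzzification and regression n_sim times, then
    summarize each coefficient (and R^2, the last entry) by min/max/mean/std."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sim):  # Step 4: repeat Steps 2 and 3
        xs = [[sample_triangular(*t, rng.random()) for t in obs] for obs in fuzzy_xs]
        ys = [sample_triangular(*t, rng.random()) for t in fuzzy_ys]
        coef, r2 = ols(xs, ys)
        draws.append(coef + [r2])
    summary = []
    for k in range(len(draws[0])):  # Step 5: statistics per coefficient
        vals = [d[k] for d in draws]
        mean = sum(vals) / len(vals)
        std = math.sqrt(sum((v - mean) ** 2 for v in vals) / (len(vals) - 1))
        summary.append({"min": min(vals), "max": max(vals), "mean": mean, "std": std})
    return summary
```

Each observation in `fuzzy_xs` is a list of (b, m, d) triples, one per independent variable, and `fuzzy_ys` holds one triple per observation. The returned list has one entry per regression coefficient followed by one entry for R^2.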

INTERPRETATION OF FUZZY MULTIPLE LINEAR REGRESSION 

Fuzzy multiple linear regression may be interpreted just as we would in the case of the
conventional multiple linear regression. For a given vector of fuzzy numbers (x_1, x_2, ..., x_n), the
corresponding value of the dependent variable y can be predicted with Eq. 1. Although the predicted
value will be a fuzzy number rather than a crisp number, the principle and the procedure are no
different from their well-established counterparts of the conventional regression analysis. Fuzzy output
reflects the uncertainty mostly in the input in this case.


The R^2 obtained here is referred to as the "fuzzy coefficient of determination" (FCD). Like the
coefficient of determination (R^2) in the conventional regression, it is used for describing the
goodness-of-fit. Because the FCD is a fuzzy number, it may be translated, by a mapping, into
a proper linguistic grade.
NUMERICAL EXAMPLES

Example 1 - This example is to perform a multiple linear regression in which the discharge rate (Q)
of water flows is assumed to be related to pipe diameter (D) and slope (S) in the manner described by the
following equation:

      Q = a_0 D^{a_1} S^{a_2}                                      (12)

Taking the logarithm of this equation yields

      \log Q = \log a_0 + a_1 \log D + a_2 \log S                  (13)

Fitting this equation to the data shown in Table 1 yields the following results:

      a_0 = 1.746, a_1 = 2.616, a_2 = 0.536, and R^2 = 0.999       (14)

The solution presented above was obtained using a computer program called FMLR (Fuzzy Multiple
Linear Regression). The program FMLR implements the procedure and equations for performing fuzzy
multiple linear regression presented earlier. When non-fuzzy data are input, the program behaves
like one which performs the conventional multiple linear regression. When the data are non-fuzzy, the
relationship obtained from regression analysis is non-fuzzy, as reflected in this example. Equation 12
(with the coefficients determined through a regression analysis) is a form of the Hazen-Williams equation
commonly used in civil and mechanical engineering.
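The non-fuzzy fit of Example 1 can be checked with an ordinary least-squares regression on base-10 logarithms. The sketch below is not the FMLR program; the helper names `solve` and `fit_log_linear` are hypothetical, and the base-10 logarithm is an assumption inferred from the values in Table 2. Under that assumption the intercept of the log-linear fit appears to correspond to the value reported as a_0.

```python
import math

# Table 1 data as (D in ft, S in ft/ft, Q in ft^3/s).
TABLE_1 = [
    (1.0, 0.001, 1.4), (1.0, 0.01, 4.7), (1.0, 0.05, 11.1),
    (2.0, 0.001, 8.3), (2.0, 0.01, 28.9), (2.0, 0.05, 69.0),
    (3.0, 0.001, 24.2), (3.0, 0.01, 84.0), (3.0, 0.05, 200.0),
]

def solve(a, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_log_linear(data):
    """Fit log10 Q = c0 + a1*log10 D + a2*log10 S; return (c0, a1, a2, R^2)."""
    rows = [[1.0, math.log10(d), math.log10(s)] for d, s, _ in data]
    ys = [math.log10(q) for _, _, q in data]
    # Normal equations (X^T X) a = X^T y for the three unknowns.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    c0, a1, a2 = solve(xtx, xty)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (c0 + a1 * r[1] + a2 * r[2])) ** 2 for r, y in zip(rows, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return c0, a1, a2, 1.0 - ss_res / ss_tot

c0, a1, a2, r2 = fit_log_linear(TABLE_1)
print(f"intercept={c0:.3f}  a1={a1:.3f}  a2={a2:.3f}  R^2={r2:.3f}")
```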


Table 1  Non-Fuzzy Data of Diameter, Slope, and Discharge Rate

   D       S         Q           D       S         Q           D       S         Q
  (ft)   (ft/ft)  (ft^3/s)      (ft)   (ft/ft)  (ft^3/s)      (ft)   (ft/ft)  (ft^3/s)

  1.0    0.001      1.4         1.0    0.01       4.7         1.0    0.05      11.1
  2.0    0.001      8.3         2.0    0.01      28.9         2.0    0.05      69.0
  3.0    0.001     24.2         3.0    0.01      84.0         3.0    0.05     200.0


Example 2 - The problem to be solved is the same as the one described in Example 1 except that the
input data are fuzzy. The given data are shown in Table 2, where each datum is a fuzzy number. Each
fuzzy number here is defined by four parameters b, d, α, and β (Eq. 6). In addition, the mode m (the
point at which the membership grade is 1.0) of each fuzzy number is shown. Note that a special case
of the fuzzy number used is a triangular fuzzy number defined by the parameters b, d, and m. Since the
input data are fuzzy, a fuzzy regression analysis is performed. Results of the fuzzy regression analysis
using FMLR are listed in Table 3. Each coefficient (a_0, a_1, a_2, or R^2) is a fuzzy number defined
by the four parameters (b, d, α, and β) of the beta-M function defined earlier. The mode of the beta-M
function is also shown as a reference.


Table 2  Fuzzy Data of Diameter, Slope, and Discharge Rate - Given as logarithms

       log D (ft)               log S (ft/ft)             log Q (ft^3/s)
    b       d     mode        b       d     mode        b       d      mode

  -0.10    0.10   0.00      -3.30   -2.70  -3.00      0.132   0.161   0.146
   0.27    0.33   0.30      -3.30   -2.70  -3.00      0.827   1.011   0.919
   0.43    0.52   0.48      -3.30   -2.70  -3.00      1.245   1.522   1.384
  -0.10    0.10   0.00      -2.20   -1.80  -2.00      0.605   0.739   0.672
   0.27    0.33   0.30      -2.20   -1.80  -2.00      1.315   1.607   1.461
   0.43    0.52   0.48      -2.20   -1.80  -2.00      1.732   2.116   1.924
  -0.10    0.10   0.00      -1.43   -1.17  -1.30      0.941   1.149   1.045
   0.27    0.33   0.30      -1.43   -1.17  -1.30      1.655   2.023   1.839
   0.43    0.52   0.48      -1.43   -1.17  -1.30      2.070   2.530   2.300

Note: In this example, the parameters α and β for all fuzzy numbers are set to be equal
to 2.42. According to Juang et al. (1992), in this case, these beta-M fuzzy numbers
take the form of a π-curve, a bell-shaped bounded function.




Table 3  Results of the Fuzzy Regression Analysis for Example 2

For the coefficient a_0:        b = 1.743,  m = 1.746,  d = 1.749,  α = [illegible],  β = 1.20
For the coefficient a_1:        b = 2.587,  m = 2.616,  d = 2.643,  α = 1.26,  β = 1.22
For the coefficient a_2:        b = 0.531,  m = 0.536,  d = 0.541,  α = 1.26,  β = 1.22
For the coefficient R^2 (FCD):  b = 0.999,  m = 1.000,  d = 1.000,  α = 1.01,  β = 0.00


Example 3 - In many engineering problems, the basis for deriving a solution often is some rules of
thumb, such as those concerning the requirements for constructing a clay liner. Symbolically, each of
these rules of thumb is expressed as follows:

      IF X_1 is A_1j and X_2 is A_2j and X_3 is A_3j
      THEN Y is B_j.

Here X_1, X_2, and X_3 are linguistic variables that have an important influence on the possibility of
meeting the requirements (Y), and A_1j, A_2j, A_3j, and B_j are the values of these linguistic
variables. For example, a rule of thumb may state:

      IF the plasticity index is medium, and the colloid percentage is high,
      and the swelling potential is low,
      THEN the possibility of meeting the EPA liner requirements is very high.

Now let's assume a group of such rules is available (Table 4). These rules may be used to establish a
predictive model for the possibility of meeting the EPA liner requirements. To begin with, the
linguistic variables used in the model need to be translated into fuzzy numbers. The linguistic terms
and their corresponding fuzzy numbers used in this example are listed in Tables 5 and 6. With the data
given in Tables 4, 5, and 6, a fuzzy multiple linear regression is performed. The results of this
analysis are listed in Table 7.

INTERPRETATION OF RESULTS OF FUZZY REGRESSION

For interpretation, the centroid of a membership function may be used to
represent the fuzzy number.


One possible approach for interpreting the obtained FCD is to translate the resulting fuzzy number
into a linguistic grade. A dictionary of linguistic grades for the goodness-of-fit, with their membership
functions predefined (such as the one shown in Table 8), may be used to describe the goodness-of-fit.
This may be done by calculating and comparing "Euclidean distances" between the resulting FCD fuzzy
number and the predefined fuzzy numbers of the linguistic terms. The Euclidean distance is a measure
of "similarity" between fuzzy numbers. Thus, the most appropriate translation is the one with the
smallest distance or highest degree of similarity. A simple model for the Euclidean distance is as follows
(Zimmermann, 1987):

      d_j = \sqrt{ \sum_{x} [ \mu_{FCD}(x) - \mu_j(x) ]^2 }        (15)

where

      d_j = distance between the FCD and the predefined fuzzy number j,
      \mu_{FCD}(x) = membership function that defines the FCD, and
      \mu_j(x) = membership function that defines the fuzzy number j.
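The translation of an FCD into a linguistic grade by smallest Euclidean distance can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the membership functions are taken to be triangular rather than beta-M, the sum in the distance is evaluated on an evenly spaced grid of 101 points over [0, 1], and the names `triangular`, `distance`, and `closest_grade` are hypothetical. The (b, mode, d) triples follow the dictionary of goodness-of-fit grades.

```python
import math

def triangular(b, m, d):
    """Return a triangular membership function with support [b, d] and mode m."""
    def mu(x):
        if x < b or x > d:
            return 0.0
        if x == m:
            return 1.0
        if x < m:
            return (x - b) / (m - b)
        return (d - x) / (d - m)
    return mu

# Linguistic grades for goodness-of-fit as (b, mode, d).
GRADES = {
    "poor": (0.00, 0.00, 0.25),
    "fair": (0.00, 0.25, 0.50),
    "good": (0.25, 0.50, 0.75),
    "very good": (0.50, 0.75, 1.00),
    "excellent": (0.75, 1.00, 1.00),
}

def distance(mu_a, mu_b, n=101):
    """Euclidean distance between two membership functions on a grid over [0, 1]."""
    xs = [i / (n - 1) for i in range(n)]
    return math.sqrt(sum((mu_a(x) - mu_b(x)) ** 2 for x in xs))

def closest_grade(mu_fcd):
    """Return (grade with the smallest distance to the FCD, all distances)."""
    dists = {name: distance(mu_fcd, triangular(*p)) for name, p in GRADES.items()}
    return min(dists, key=dists.get), dists

grade, dists = closest_grade(triangular(0.69, 0.90, 0.91))
print(grade, {k: round(v, 3) for k, v in dists.items()})
```

Because the distance depends on the shape of the membership functions, a triangular approximation may rank grades differently from the beta-M functions when two grades are nearly equidistant.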


Table 4  Rules of Thumb for Assessing the Possibility of Meeting the EPA Requirements

Plasticity index   Colloid percentage   Swelling potential   Possibility of meeting
     (PI)                (CP)                 (SP)           the EPA requirements

     high                high                 high                 low
     high                high                 medium               medium
     high                high                 low                  high
     high                medium               high                 low
     high                medium               medium               low
     high                medium               low                  medium
     high                low                  high                 very low
     high                low                  medium               low
     high                low                  low                  medium
     medium              high                 high                 medium
     medium              high                 medium               medium
     medium              high                 low                  very high
     medium              medium               high                 low
     medium              medium               medium               medium
     medium              medium               low                  very high
     medium              low                  high                 very low
     medium              low                  medium               low
     medium              low                  low                  medium
     low                 high                 high                 low
     low                 high                 medium               medium
     low                 high                 low                  high
     low                 medium               high                 low
     low                 medium               medium               medium
     low                 medium               low                  high
     low                 low                  high                 very low
     low                 low                  medium               low
     low                 low                  low                  medium


In Example 2, the degree of fuzziness in the data is small. As a result, the fuzziness in the
resulting coefficients is small, and interpretation of the "goodness-of-fit" is easy. In this case, R^2 is
essentially equal to 1.0, and the fitting (regression) is rated as "excellent." In Example 3, the input data
are fuzzier and the resulting coefficients reflect this fact. The FCD fuzzy number is between the fuzzy
numbers representing "very good" and "excellent" listed in Table 8. The Euclidean distance between the
FCD and "very good" is larger than that between the FCD and "excellent." Thus, the fitting (regression)
is rated as "excellent."

Another approach to interpreting the "goodness-of-fit" is to plot the predicted versus observed values
of the dependent variable. However, the predictions and observations, both being fuzzy numbers, need
to be "de-fuzzified" first. In this case, the "center of gravity" approach may be used.

Table 5  Linguistic Terms and Their Corresponding Fuzzy Numbers - Independent Variables

  (PI)      b    d   mode      (CP)      b    d   mode      (SP)      b    d   mode

  high     25   40    30       high     20   30    25       high     25   40    30
  medium   10   30    20       medium    5   25    15       medium   10   30    20
  low       0   15    10       low       0   10     5       low       0   15    10

Note: In this example, the parameters α and β for all beta-M fuzzy numbers are set to be equal to 2.42.
Other membership functions such as triangular or trapezoidal shape function, if desired, may be used.

Table 6  Linguistic Terms and Their Corresponding Fuzzy Numbers - Dependent Variable

  Fuzzy number      Linguistic Grade for "Possibility"
  parameters        very low    low    medium    high    very high

  [The numerical entries of this table are illegible in the source.]

Note: In this example, the parameters α and β for all beta-M fuzzy numbers are set to be equal to 2.42.
Other membership functions such as triangular or trapezoidal shape function, if desired, may be used.

Once a satisfactory fuzzy relation is established through a regression analysis, it may be used to
predict the value of the dependent variable for given values of the independent variables. For example,
the equation obtained in Example 3 for predicting the possibility of meeting the EPA requirements is as
follows:

      P = a_0 + a_1 (PI) + a_2 (CP) + a_3 (SP)                     (16)

where P = the possibility of meeting the EPA clay liner requirements,
      PI = the plasticity index,
      CP = the colloid percentage,
      SP = the swelling potential, and
      a_0, a_1, a_2, and a_3 = the coefficients defined in Table 7.

With this equation, the possibility of meeting the EPA liner requirements may be estimated for a given
set of conditions regarding the plasticity index, colloid percentage, and swelling potential of the clay used.
Since the values of the three independent variables PI, CP, and SP, and the coefficients a_0, a_1, a_2, and
a_3 are all fuzzy numbers, the evaluation of this equation involves fuzzy computations. However, this can
easily be done using the JHE method, simply by replacing Step 3 in the FMLR procedure presented earlier
with ordinary addition and multiplication (Eq. 16). The result of such computation would be a fuzzy
number representing the possibility of meeting the EPA requirements. The methods used for interpreting
the FCD may be employed to interpret this resulting fuzzy number, and the possibility of meeting the EPA
liner requirements is thus assessed.

Table 7  Results of the Fuzzy Regression Analysis for Example 3

For the coefficient a_0:        b = 0.351,    m = 0.702,    d = 0.905,    α = 1.29,  β = 0.75
For the coefficient a_1:        b = -0.0028,  m = -0.0025,  d = -0.0019,  α = 0.41,  β = 0.83
For the coefficient a_2:        b = 0.010,    m = 0.014,    d = 0.014,    α = 0.77,  β = 0.00
For the coefficient a_3:        b = -0.020,   m = -0.020,   d = -0.016,   α = 0.00,  β = 1.20
For the coefficient R^2 (FCD):  b = 0.69,     m = 0.90,     d = 0.91,     α = 1.76,  β = 0.13


Table 8  Fuzzy Numbers and Linguistic Grades for Describing Goodness-of-Fit

  Fuzzy number      Linguistic Grade for Goodness-of-Fit
  parameter         poor    fair    good    very good    excellent

  b                 0.00    0.00    0.25      0.50         0.75
  d                 0.25    0.50    0.75      1.00         1.00
  mode              0.00    0.25    0.50      0.75         1.00

Note: Here the parameters α and β for all beta-M fuzzy numbers are set to be equal
to 2.42. According to Juang et al. (1992), in this case, these beta-M fuzzy numbers
take the form of a π-curve, a bell-shaped bounded function.


DISCUSSIONS 

It is observed that the modes of the membership functions of a_0, a_1, and a_2 obtained in Example
2 are practically identical to the coefficients obtained in Example 1, where standard regression was
performed. As such, it might be speculated that a general relationship may be established between the
range or dispersion in the membership functions of fuzzy variables D, S, and Q and the dispersion in the
membership functions of the resulting coefficients a_0, a_1, a_2, and R^2. However, a series of sensitivity
analyses performed in this study (not shown here) seems to reject the existence of such a general relationship.




The number of simulations needed to obtain a "steady" result is about 1000 for the examples presented.


SUMMARY AND CONCLUSIONS

A new approach and procedure for performing fuzzy multiple linear regression is presented. The
approach is based on the JHE method applied in the setting of multiple linear
regression. By treating the conventional regression equations as a function (model), the JHE method
can be applied to perform the fuzzy regression. [Part of this paragraph is illegible in the source.]
More work is needed to verify the proposed approach.


ACKNOWLEDGMENT

[Most of this section is illegible in the source. The surviving fragment refers to the
EPA clay liner requirements listed in a table.]


REFERENCES

1. Bardossy, A., Bogardi, [...], "[...] regression for resistivity [...]
   relationships," Proceedings, [...], W. Lafayette, IN, 1987.

2. [...] W.E., "Geostatistics utilizing imprecise (fuzzy) i[...],"
   Fuzzy Sets and Systems 31, 1989, pp. 311-328.

3. [...], "[...] with fuzzy model," IEEE [...].

[Entries 4 through 7 are illegible in the source.]

8. Zimmermann, H.J., Fuzzy [...],
   Boston, 1987.




Incorporation of Varying Types of Temporal Data in a Neural Network 


M. E. Cohen*, D. L. Hudson# 

*California State University, Fresno, CA 93740
#University of California, San Francisco, 2615 E. Clinton Avenue, Fresno, CA 93703 


ABSTRACT 

Most neural network models do not specifically deal with temporal data. 
Handling of these variables is complicated by the different uses to which 
temporal data are put, depending on the application. Even within the same 
application, temporal variables are often used in a number of different ways. 
In this paper, types of temporal data are discussed, along with their 
implications for approximate reasoning. Methods for integrating 
approximate temporal reasoning into existing neural network structures are 
presented. These methods are illustrated in a medical application for
diagnosis of graft-versus-host disease, which requires the use of several types
of temporal data.


INTRODUCTION 

Neural network modeling has received renewed attention in recent years [1]. 
Advances in both hardware and software have made the use of these systems for large-scale 
practical purposes feasible [2]. Neural network use is expanding rapidly in numerous 
domains [3-5]. Medicine has been a prime area of application of decision support systems 
based on neural networks for a number of reasons [6-8], including the difficulty of 
developing a traditional knowledge-based system for complex medical applications. A 
number of researchers have also investigated incorporation of fuzzy variables and 
techniques of approximate reasoning into neural network structures [9-14], including a 
number dealing with medical decision making [15,16]. Only recently has some attention 
been paid to the incorporation of temporal information in neural network models [17-19]. 
Temporal data have different interpretations depending on the application, thus general 
techniques cannot be successfully implemented without examining the ultimate usage of 
each of these variables. For example, the most straight-forward usage of temporal 
variables is in partial differential equations in which the time variable is clearly defined in 
mathematical terms and requires no further interpretation. However, only a few 
applications are well-understood enough to lend themselves to modeling through 
differential equations. For other less well-understood subjects, other approaches must be 
taken. 





One of the strengths of the neural network approach is that its basic structure reflects the
architecture of biological nervous systems, concentrating on the structure of
the individual neuron, as well as the massively parallel nature of biological nervous systems
[20]. Unfortunately, the processing of temporal information is only partially understood in
the biological sense. The operation of short-term temporal influences can be explained by
inhibitory and excitatory biochemical influences at the synapses, which account for the
handling of colliding signals within very short time intervals. However, the longer term
handling of temporal information, including memory itself, is still a major area of cognitive
research. Unfortunately, the current level of knowledge pertaining to this aspect of
biological systems cannot provide a clear model for handling temporal
information.

In the next sections, different types of temporal data are examined, followed by the
incorporation of these variables into an existing
neural network model developed by the authors, followed by a discussion of
the use of fuzzy variables to represent both temporal and state variables.

TYPES OF TEMPORAL DATA

In traditional applications, temporal data have been handled in a number of ways,
depending on the application. In well-defined models, partial differential equations can be
used to represent temporal variables in the same way as state variables. Another valuable
approach is the use of state-space diagrams, using transition functions to lead from one
state to another. In the development of decision making algorithms for areas such as
medicine, in general not enough information exists either to define a differential equation
or to define state-space diagrams. For applications such as this, in which the majority of
available information is contained in accumulated databases, neural networks offer a
natural means for development of decision models. Toward this end, it is useful to analyze
the manner in which temporal information is important to medical decision making, and in
fact to other areas of decision making which rely on numerous findings which are utilized
to differentiate among categories.

Temporal data can be divided into the following categories, depending on which
aspects of the data are important:

1. Δ Data: The change in value from the previous recording (examples: blood
pressure, cholesterol);

2. Normalized Δ Data: The change in value relative to the time interval (examples:
weight gain or loss, hemoglobin level);

3. Duration Data: The duration of time for which the finding persisted (examples:
chest pain, fatigue);

4. Sequence Data: A particular sequence of events (examples: fever occurring
before rash occurring before generalized fatigue, noun occurring before verb
occurring before adjective).

Each of these variable types requires special handling, as discussed in the
following section.



A. Δ Data, Normalized Δ Data, and Duration Data

These data types can be handled in a straightforward manner, according to the
following schemes. Let n(t_i) be the value of the nth variable at time t_i, and let

      \Delta n = n(t_i) - n(t_{i-1})                               (1)

      \Delta t = t_i - t_{i-1}                                     (2)

Assign a new node in the neural network for Δ data such that

      p_n = \Delta n.                                              (3)

The original network is then expanded by the number of nodes required to accommodate
the items for which the change is important. For normalized Δ data follow the same
procedure as before, except let

      q_n = \Delta n / \Delta t.                                   (4)

Duration data can also be handled simply, by establishing

      r_n = \Delta t_0 = t_i - t_0.                                (5)

For duration data, the important parameter is the length of persistence of a finding. Thus
the \Delta t_0 in this case is the difference between the current time t_i and the time t_0 when the
finding originally occurred. It should also be noted that time measures (e.g. minutes, hours,
days, months, years) should be normalized for each application.
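Equations (1) through (5) translate directly into three small functions. The sketch below is only illustrative: the function names are hypothetical, inputs are assumed crisp, and all times are assumed to be expressed in one common unit already.

```python
def delta_node(value_now, value_prev):
    """p_n = n(t_i) - n(t_{i-1}): change since the previous recording (Eq. 3)."""
    return value_now - value_prev

def normalized_delta_node(value_now, value_prev, t_now, t_prev):
    """q_n = delta n / delta t: change per unit time (Eq. 4)."""
    if t_now == t_prev:
        raise ValueError("the two recordings must be taken at different times")
    return (value_now - value_prev) / (t_now - t_prev)

def duration_node(t_now, t_onset):
    """r_n = t_i - t_0: how long the finding has persisted (Eq. 5)."""
    return t_now - t_onset

# Illustrative values, with times in months.
print(delta_node(300, 170))                   # change in a laboratory count
print(normalized_delta_node(140, 130, 4, 2))  # change per month
print(duration_node(4, 1))                    # months since onset
```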


B. Sequence Data

This is the most difficult problem in that a new variable cannot be created to deal
with this entity. A major modification must be made to the neural network structure for
accommodating this type of reasoning. These data are handled by embedding a procedure
at each of the sequence nodes. To analyze for the presence of a sequence, let f_j, j = 1,...,k,
be the jth finding out of k and let t_i be the ith time interval. Define the square k x k matrix
S = [s_ij] where

      s_ij = 1 if f_j occurred at time t_i
             0 otherwise.                                          (6)

For a proper sequence,

      s_ij = 1 if i = j.

Thus tr[S] = k if the proper time sequence occurred, where tr[S] is the trace of the matrix
S. The value of node u_n is then determined by:

      u_n = 1 if tr[S] = k
            0 otherwise.                                           (7)



IMPLICATIONS FOR APPROXIMATE REASONING 

The above constructs assume crisp input. The following modifications can be made 
to accommodate fuzzy input. 

A. Δ Data, Normalized Δ Data, and Duration Data

For these data types there are two parameters which may assume fuzzy rather than
crisp values: the time-dependent finding n(t_i) and the time interval Δt. The finding may be of
several types: binary, categoric, integer, or continuous. In fact, for these types of
temporal data the values themselves are not important, only the differences in the values.
If the value itself is important, it is included as a separate node in the network. It
can be shown that if M and N are fuzzy numbers [21], then

      M \ominus N = M \oplus (-N)                                  (8)

will also be a fuzzy number, where M \oplus N is extended addition, and

      M \oslash N = M \otimes (N^{-1})                             (9)

likewise is a fuzzy number, where M \otimes N is extended multiplication.


B. Sequence Data

For sequence data, whether or not a series of events occurred in a given order is
a crisp result. However, the degree to which the sequence occurred in the correct order
can be considered. Instead of setting node u_n as in equation (7), consider

      u_n = tr[S] / k                                              (10)


The definition provides a degree to which the sequence occurred in the required order.
For example, consider the k x k matrix

          | 1  0  0  0  ...  0 |
      S = | 0  0  1  0  ...  0 |                                   (11)
          | 0  1  0  0  ...  0 |
          | 0  0  0  1  ...  0 |

Then u_n = (k-2)/k, the degree to which the required sequence was met. Each row in this
matrix represents a point in time, and each column represents a symptom; s_ij = 1 if at time
i symptom j is present.
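Both the crisp node of equation (7) and the graded node of equation (10) reduce to a trace computation. The following sketch assumes S is given as a square list of lists of 0/1 entries with rows as time points and columns as findings; the function names are hypothetical.

```python
def trace(matrix):
    """Sum of the diagonal entries of a square matrix given as a list of rows."""
    if any(len(row) != len(matrix) for row in matrix):
        raise ValueError("matrix must be square")
    return sum(matrix[i][i] for i in range(len(matrix)))

def crisp_sequence_node(s):
    """Eq. (7): 1 only if every finding occurred in its required time slot."""
    return 1 if trace(s) == len(s) else 0

def graded_sequence_node(s):
    """Eq. (10): fraction of findings that occurred in the required slot."""
    return trace(s) / len(s)

# Findings 2 and 3 occurred in swapped order, so two diagonal entries are 0.
swapped = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
print(crisp_sequence_node(swapped), graded_sequence_node(swapped))
```

With k = 4 this matrix gives a graded value of (k-2)/k = 0.5, while the crisp node is 0 because the trace is less than k.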





EXAMPLE 


The method is illustrated on a problem for graft-versus-host disease (GVHD) taken
from [22], and used as the basis for a recent workshop [23]. GVHD is a disorder which can
occur after any kind of transplant operation, ranging from an organ transplant to tissue
transplants, such as bone marrow. The disease exists in three forms: acute, chronic, and
syngeneic. It is a complex disease in which changes in symptoms over time are extremely
important for diagnostic purposes. The objective of the neural network decision aid is to
determine if the disease exists in any of its three forms, or not at all.


Fig. 1 shows a neural network for this problem. Nodes n_1 through n_{k1} are standard
nodes, p_1 through p_{k2} are Δ nodes, q_1 through q_{k3} are normalized Δ nodes, r_1 through r_{k4}
are duration nodes, and u_1 through u_{k5} are sequence nodes. The following are examples of
each type of node for GVHD:

      n_{i1}:  presence of total body erythroderma (standard node)
      n_{i2}:  thrombocytopenia (standard node)
      p_j:     change in number of B cells (Δ node)
      q_k:     sudden weight loss (normalized Δ node)
      r_l:     continued thrombocytopenia (duration node)
      u_m:     u_m1: pruritic maculopapular rash (sequence node)
               u_m2: gastrointestinal abnormalities
               u_m3: liver dysfunction
               u_m4: bleeding

Note that thrombocytopenia is important both for its presence and for the length of
time for which it has been present. In this application, all times will be considered to be in
months or fractions of months, and are given as offsets from the initial visit, which is
considered to be 0.

To illustrate, consider the following values for the above example:

      n_{i1} = 0.9   (degree of presence of total body erythroderma)
      n_{i2} = 1.0   (degree of presence of thrombocytopenia)

      p_j = \Delta n_{i3} = n(t_{i3}) - n(t_{i3-1}) = 300 - 170 = 130      (assumes crisp values)
            (change in number of B cells)

      q_k = \Delta n_{i4} / \Delta t_{i4} = (n(t_{i4}) - n(t_{i4-1})) / (t_{i4} - t_{i4-1})      (assumes crisp values)
          = (140 - 130)/(4 - 2) = 5   (weight change/month)

      r_l = t_i - t_0 = 4 - 1 = 3     (assumes crisp values)
            (continued thrombocytopenia)

      u_m:
                     u_m1  u_m2  u_m3  u_m4
              t_1  |   1     0     0     0  |
      U =     t_2  |   1     1     0     0  |
              t_3  |   0     1     0     1  |
              t_4  |   0     0     0     1  |

      u_m = tr[U]/k = 3/4 = 0.75

These values then become the values of the input nodes. Along with known classification
values, the appropriate weighting factors are determined through the learning algorithm.


The network is trained on data of known classification to determine weighting
factors for each of these nodes, both from the input layer to the intermediate layer, and
from the input layer to the output layer. The result of the process is a differential diagnosis
in which the degrees of presence of each form of the disease can be ranked.


FUZZY NEURAL NETWORKS 

Another issue in the establishment of fuzzy neural networks is the role of linguistic 
quantifiers [24]. Considering the above example, the entry for "sudden weight loss" 
refers to the normalized A data node. Although the linguistic variable "sudden" is not 
handled directly by the neural network learning algorithm, this concept is adequately 
represented by the amount of weight loss over a given time interval. The algorithm uses 
this information through the supervised learning process to assign an appropriate weight to 
this finding. 

For the example in the previous section for A data, consider the B cell count. Due 
to inaccuracies in laboratory analyses, these results can be considered fuzzy numbers. If we 
assume each reading to be a fuzzy triangular number centered around the given values, 
with an experimental error of 5%, then the fuzzy values would be: 

      n(t_{i3}) = (285, 315)

      n(t_{i3-1}) = (161.5, 178.5)

Then equation (8) can be applied. Similar results may be obtained for the other variables. 
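As a worked illustration of equation (8) on these readings, the sketch below represents a triangular fuzzy number as a (low, mode, high) triple and computes the extended subtraction as addition of the negated number. The `Tri` type and the helper names are hypothetical, and treating the 5% experimental error as a symmetric spread about the reading is an assumption taken from the preceding paragraph.

```python
from typing import NamedTuple

class Tri(NamedTuple):
    """Triangular fuzzy number: support [low, high] with peak at mode."""
    low: float
    mode: float
    high: float

def negate(n: Tri) -> Tri:
    """-N: reflect the support about zero, so the bounds swap roles."""
    return Tri(-n.high, -n.mode, -n.low)

def add(m: Tri, n: Tri) -> Tri:
    """Extended addition of triangular fuzzy numbers (component-wise)."""
    return Tri(m.low + n.low, m.mode + n.mode, m.high + n.high)

def subtract(m: Tri, n: Tri) -> Tri:
    """Extended subtraction M - N computed as M + (-N), as in Eq. (8)."""
    return add(m, negate(n))

def with_relative_error(value: float, rel: float) -> Tri:
    """Triangular number centered on value with spread rel * value."""
    return Tri(value * (1 - rel), value, value * (1 + rel))

current = with_relative_error(300, 0.05)   # about (285, 300, 315)
previous = with_relative_error(170, 0.05)  # about (161.5, 170, 178.5)
change = subtract(current, previous)       # about (106.5, 130, 153.5)
print(change)
```

The support of the difference is wider than that of either reading because the lower bound pairs the smallest current value with the largest previous value, and the upper bound does the reverse.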

In order to handle fuzzy triangular numbers for standard nodes which do not 
represent A data, the algorithm for handling input interval data described in the next 
section can be applied. 






Figure 1: Neural Network Structure Showing Temporal Data 


NEURAL NETWORK MODEL AND LEARNING ALGORITHM 

The heart of the neural network model is the learning algorithm. The topic of 
learning with fuzzy information has a long history, beginning with Wee and Fu’s 
consideration of a fuzzy automaton in 1969 [25]. Kaufmann also considered fuzzy
perceptrons in 1977 [26]. Zadeh suggested a different approach which used linguistically 
valued features [27]. Fuzzy isodata clustering algorithms have also been developed [28]. 
All of these approaches have relevance for neural network algorithms. 

Following the learning algorithm previously developed by the authors, the temporal 
nodes are added, as shown in Fig. 1. If the node is fuzzy, an interval approach is taken, as 
previously described [29], in order to accommodate all extreme values. The algorithm 
permits the input of binary, categoric, integer, or continuous data, as long as an ordering 
exists for the categoric data. Variables which are not independent can also be handled 
directly. A summary of the interval data handling is given here. 


Handling of Interval Data 

In order to handle interval data as input, the following is proposed. For a data set
with n variables, define a vector

      \mathbf{x} = [(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)]       (12)

where (x_i, y_i) represents the interval range for the ith variable. The values for (x_i, y_i) will be
determined by the input data in the training set for the learning algorithm. The objective is
to obtain a decision surface which will separate data at any point in the interval. This can
be accomplished if the extreme values are accommodated. In order to do this, all possible
combinations of interval endpoints must be considered. For a data set with n variables, 2^n
combinations will be produced. A new set of 2^n vectors is then defined:

      \mathbf{z}_k = [z_1, z_2, ..., z_n],   k = 1, ..., 2^n       (13)

where z_i \in \{x_i, y_i\} such that all possible combinations of x_i, y_i are generated for i = 1,...,n. The
learning algorithm is run for each of the 2^n cases. The weights attached to the decision
surface which produces the poorest classification are chosen in order to form a robust model.
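The enumeration of the 2^n endpoint vectors of equation (13) and the selection of the worst-case weights can be sketched as follows. The learning algorithm itself is not specified here, so `train` is a hypothetical callable assumed to return a (weights, classification accuracy) pair for one endpoint vector; both function names are likewise illustrative.

```python
from itertools import product

def endpoint_vectors(intervals):
    """All 2^n vectors formed by picking one endpoint from each (x_i, y_i)."""
    return [list(v) for v in product(*intervals)]

def select_robust_weights(intervals, train):
    """Run train on every endpoint vector and keep the weights from the run
    with the poorest accuracy. train(vector) -> (weights, accuracy) is a
    hypothetical stand-in for the learning algorithm."""
    results = [train(v) for v in endpoint_vectors(intervals)]
    weights, _ = min(results, key=lambda r: r[1])
    return weights

# Two interval-valued variables give 2^2 = 4 endpoint vectors.
for z in endpoint_vectors([(285, 315), (161.5, 178.5)]):
    print(z)
```

Since the number of vectors doubles with each additional interval variable, this exhaustive enumeration is only practical for small n unless the runs are distributed, which is consistent with the remark on parallel processing in the conclusion.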


CONCLUSION 

The neural network approach for development of decision support systems offers a 
number of advantages, including easy development of the knowledge base. As illustrated 
above, temporal data of several types can be accommodated into the existing framework. 
Variables can assume either crisp or fuzzy values. The resulting system can be used alone, 
or in conjunction with a knowledge-based expert system to bring to bear all relevant 
information whether from expert input or databases. In order to implement practical 
systems using interval data with large numbers of variables, it may be necessary to utilize 
parallel processing to establish models. The models themselves can be applied to new 
cases using standard sequential computers. Work is continuing in this area to streamline
algorithms and to accommodate other types of fuzzy data into the system. 


REFERENCES 

[1] D.E. Rumelhart, J.L. McClelland, and the PDP Research Group, Parallel
Distributed Processing, vols. 1 and 2, Cambridge, MA: MIT Press, 1986.

[2] G. Carpenter, S. Grossberg, The art of adaptive pattern recognition by a
self-organizing network, Computer, 21, 3, 1988, 77-88.

[3] S.I. Gallant, Connectionist expert systems, Communications ACM, vol. 31, 2,
pp. 152-169, 1988.

[4] B. Kosko, Hidden patterns in combined and adaptive knowledge networks, Int. J. of
Approximate Reasoning, vol. 2, pp. 377-393, 1988.

[5] J.C. Bezdek, On the relationship between neural networks, pattern recognition and
intelligence, International Journal of Approximate Reasoning, vol. 6, 2, pp. 107, 1992.

[6] R.C. Eberhart, R.W. Dobbins, Using neural networks in hybrid medical diagnostic
systems, Proc. EMBS, IEEE, vol. 13, pp. 1470-1471, 1991.

[7] D.L. Hudson, M.E. Cohen, M.F. Anderson, Use of neural network techniques in a
medical expert system, Int. J. of Intelligent Systems, 6, 213-223, 1991.

[8] C.L. D'Autrechy, J.A. Reggia, G.G. Sutton III, S.M. Goodall, M.A. Tagamets,
"Developing connectionist models with MIRRORS/II," SCAMC, vol. 12, pp. 276-280,
1988.

[9] D.C. Kuncicky, A. Kandel, A fuzzy interpretation of neural networks, Proceedings,
International Fuzzy Set Association, 3, 1989, 113-116.




[10] I. Hayashi, H. Nomura, N. Wakami, Artificial neural network driven fuzzy control
and its application to the learning of inverted pendulum system, Proceedings,
International Fuzzy Set Association, 3, 1989, 610-613.

[11] H. Takagi, I. Hayashi, Artificial neural network-driven fuzzy reasoning, International
Workshop on Fuzzy Systems Applications, 1988, 217-218.

[12] R.R. Yager, Modeling and formulating fuzzy knowledge bases using neural networks,
Iona College Machine Intelligence Institute, Report #MII-1111, 1-29, 1991.

[13] T. Saito, M. Mukaidono, A learning algorithm for max-min network and its
application to solve fuzzy relation equations, Proceedings, IFSA, R. Lowen, M.
Roubens, eds., 184-187, 1991.

[14] M.M. Gupta, M.B. Gorzalczany, Fuzzy neuro-computational technique and its
application to modelling and control, Proceedings, IFSA, R. Lowen, M. Roubens, eds.,
46-49, 1991.

[15] E. Sanchez, Fuzzy logic neural networks in artificial intelligence and pattern
recognition, Int. Conf. on Stochastic and Neural Methods in Signal Processing, Image
Processing and Computer Vision, 1991.

[16] A.F. Rocha, M. Theoto, M. Theoto Rocha, Investigation medical linguistic variables,
in Proceedings, IFSA, R. Lowen, M. Roubens, eds., 180-183, 1991.

[17] B. de Vries, J.C. Principe, A new neural network model for temporal processing,
Proc. EMBS, IEEE, vol. 12, pp. 1439-1440, 1990.

[18] J.A. Villarreal, R.O. Shelton, A space-time neural network, Int. J. of Approximate
Reasoning, vol. 6, 2, pp. 133-149, 1992.

[19] J.L. Elman, Finding structure in time, CRL Technical Report 8801, Center for
Research in Language, Univ. California, San Diego, 1988.

[20] C.M. Butter, Neuropsychology: The Study of Brain and Behavior, Brooks/Cole,
Belmont, CA, 1968.

[21] D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications, vol. 144,
Academic Press, Orlando, 1980.

[22] J.L.M. Ferrara, H.J. Deeg, Graft-versus-host disease, N. Eng. J. Med., vol. 324, 10, pp.
667-674, 1991.

[23] D.L. Hudson, M.E. Cohen, Use of a hybrid expert system to diagnose graft-versus-
host disease, in Artificial Intelligence in Medicine, AAAI Spring Symposium, pp. 53-57,
1992.

[24] L.A. Zadeh, Fuzzy sets as a basis for a theory of possibility, Fuzzy Sets and Systems, 1,
3-28, 1978.

[25] W.G. Wee, K.S. Fu, A formulation of fuzzy automata and its application as a model
of learning systems, IEEE Trans. Syst. Sci. Cybern., vol. 5, pp. 215-223, 1969.



[26] A. Kaufmann, Introduction à la Théorie des Sous-ensembles Flous, vol. 4: Compléments et Nouvelles Applications, Masson, Paris, 1977.

[27] L.A. Zadeh, Fuzzy sets and their application to pattern classification and cluster analysis, in Classification and Clustering, Academic Press, 1977.

[28] J.C. Dunn, A fuzzy relative of the ISODATA process and its use in detecting compact well-separated clusters, J. Cybern., vol. 3, 3, pp. 32-57, 1974.

[29] M.E. Cohen, D.L. Hudson, Approaches to the handling of fuzzy input data in neural networks, IEEE Conference on Fuzzy Systems, 93-100, 1992.




N93-29576 








FUZZY OPERATORS AND CYCLIC BEHAVIOR IN FORMAL NEURONAL NETWORKS.

Fuzziness may lead to chaotic dynamics.

E. Lábos¹, A.V. Holden², J. Laczkó³, L. Orzó¹, A.S. Lábos¹
1: Semmelweis Univ. Med. School, 1st Dept. of Anatomy, Neurobiology Unit of the Hung. Acad. of Sci., 1450 Budapest, Tűzoltó u. 58, HUNGARY; 2: The Univ. of Leeds, Center for Nonlinear Studies, Leeds LS2 9JT, UK; 3: Ludwig Maximilian Univ., Klinikum Grosshadern, Neurologische Klinik München, GERMANY and Central Res. Inst. of Physics of Hung. Acad. Sci., HUNGARY

ABSTRACT - In formal neuronal networks (FNN) built of threshold gates, a unit step function is applied. It is regarded as a degenerated distribution function (DDF) and will be referred to here as a non-fuzzy threshold operator (nFTO). Special networks of this kind generating long cycles of states are modified by introduction of fuzzy threshold operators (FTO), i.e. non-degenerated distribution functions (nDDF). The cyclic behavior of the new nets is compared with that of the original ones. The interconnection matrix and threshold values are not modified. It is concluded that the original long cycles change into (1) fixed points, (2) shorter cycles or (3), as computer simulations demonstrate, aperiodic motion or chaotic behavior. The emergence of the above changes depends on the steepness of the threshold operators.

INTRODUCTION 

A formal neuronal network (FNN) means now more than a McCulloch-Pitts network (1943): (1) the states of units and nets are fixed; (2) an interconnection matrix is fixed and synthesized through some process, here not by "learning"; (3) thresholds are specified for each unit; (4) a threshold function (a unit step function or a softer "S-shaped" function) is finally applied. Thus the computation of the new network state is as follows:

$$s_{\mathrm{old}} = s \;\to\; sM \;\to\; sM - \Theta \;\to\; T(sM - \Theta) = s_{\mathrm{new}} \qquad (1)$$

The sequence of states $s_j$ can be generated by the iteration of this network mapping N, which incorporates: (1) the interconnection matrix M; (2) the threshold vector $\Theta$; and (3) the threshold operator T. The networks may differ from each other by these objects. In learning processes an (M, $\Theta$) sequence is generated in the hope of reaching a fixed network. In the case of simulated annealing related to Boltzmann machines, the steepness of a T-like function is changed to reach the limiting result.
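The iteration of mapping (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-unit matrix `M` and threshold vector `theta` below are hypothetical toy values (not one of the authors' synthesized networks), the state is treated as a row vector multiplying M from the left, and the step operator is assumed to return 1 for a strictly positive argument.

```python
from typing import Callable, Optional, Sequence, Tuple

State = Tuple[float, ...]


def step(r: float) -> float:
    """Non-fuzzy threshold operator (assumed convention: 1 for r > 0, else 0)."""
    return 1.0 if r > 0 else 0.0


def next_state(s: State, M: Sequence[Sequence[float]],
               theta: Sequence[float], T: Callable[[float], float]) -> State:
    """One application of s -> T(sM - theta), with s used as a row vector."""
    n = len(s)
    return tuple(T(sum(s[i] * M[i][j] for i in range(n)) - theta[j])
                 for j in range(n))


def cycle_length(s0: Sequence[float], M: Sequence[Sequence[float]],
                 theta: Sequence[float], T: Callable[[float], float],
                 max_steps: int = 100_000) -> Optional[int]:
    """Iterate from s0 until a state repeats; return the length of that cycle."""
    seen = {}
    s = tuple(float(v) for v in s0)
    for t in range(max_steps):
        if s in seen:
            return t - seen[s]
        seen[s] = t
        s = next_state(s, M, theta, T)
    return None  # no repetition found within max_steps


# Hypothetical toy net: unit 1 computes NOT(unit 2), unit 2 copies unit 1.
M = [[0.0, 1.0],
     [-1.0, 0.0]]
theta = [-0.5, 0.5]

# Visits (0,0) -> (1,0) -> (1,1) -> (0,1) -> (0,0): a cycle of length 4 = 2**2.
print(cycle_length((0, 0), M, theta, step))
```

For this toy net the binary iteration passes through all four states, so it is a small instance of a maximal cycle in the sense used later in the paper.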

In this study, the matrix M and the threshold vector are fixed; the iteration is the only change. External input vectors, which would transform the machine into a non-autonomous (open) system, are not introduced. Learning (or adaptive synthesis) is not present. The networks are, however, very special. They were originally (Lábos, 1980-1987) synthesized in order to generate long finite cycles or to design networks with the minimum number of elements required for a given cycle length. The aim is to investigate the influence of the threshold operator T on the length of the cycles which appear during the iteration. For this reason "S-shaped" (still monotonic) operators are introduced, as is done in "neurocomputer science". This is the "fuzzy" aspect of the study.





1. MULTIPLE AND PRESENT MEANING OF FUZZY OPERATORS. 

Among the possible meanings of "fuzzy" objects (sets, logic, grammars, languages, programs, environment, graphs, topology, etc.; see in Zadeh et al., 1975) only the following designation will be applied: as a membership function or degenerated distribution function. The non-fuzzy possibility is as follows:

$$T(r) = u(r) = \begin{cases} 1 & \text{if } r > 0 \\ 0 & \text{if } r \le 0 \end{cases} \qquad (2)$$


Such a function is used in threshold logic or formal neurons and in their networks by coordinates. Since the 1970s the concept has been extended to S-shaped functions, or the application of such non-idealized steps became a necessity. However, the [text illegible in source] cells as special "measuring devices" included in [illegible] a special measuring procedure, generating a "measure space" (MS) in [illegible] sense (Lábos, 1988; Halmos, 1974). The MS-s are closely related to distribution functions. Thus this generalization is plausible.

DEFINITION: A real valued function F is called a distribution function if the following conditions are satisfied: (1) F is monotonically increasing; (2) $0 \le F \le 1$; (3) F is semicontinuous from the left, i.e. $\lim F(r) = F(0)$ if r tends to zero from the negative direction. The opposite continuity is not demanded, but permitted; (4) optionally, the differentiability of the function F is also supposed.

REMARK: All distribution functions, including the discrete ones, belong to this category. Certain textbooks demand one-sided continuity from the opposite direction, which does not make essential differences.

DEFINITION: A degenerated distribution function (DDF) is the function defined in eq. (2) and widely used in threshold logic. It is here called the non-fuzzy threshold operator (nFTO). An arbitrary, non-degenerated distribution function defining a Lebesgue-Stieltjes measuring space may have the name of fuzzy threshold operator.

REMARK: The "fuzzy" attribute makes the nomenclature applied in measure theory, logic or for membership functions uniform. The semantical background is arbitrary. E.g. a response curve which may occur in a single natural or artificial neuron can be regarded either as a temporal average or as the response of a population of cells. A normalization to remain between 0 and 1 is useful, but can be omitted.

2. METHOD: COMPARISON OF BINARY THRESHOLD GATE WITH FUZZY THRESHOLD GATES (FTO-s).

The FTO-s or non-DDF-s applied here are as follows:

$$T_1(x) = \frac{e^{kx}}{1 + e^{kx}} = \frac{1}{1 + e^{-kx}} \qquad (3)$$

$$T_2(x) = 0.5 + 0.5\,\frac{e^{kx} - e^{-kx}}{e^{kx} + e^{-kx}} \qquad (4)$$
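The two operators are easy to evaluate numerically. The sketch below assumes the readings $T_1(x) = 1/(1 + e^{-kx})$ and $T_2(x) = 0.5 + 0.5\tanh(kx)$ for eqs. (3) and (4); the function names are illustrative only. Under these readings a short calculation gives $T_2(x; k) = T_1(x; 2k)$, so the two operators differ only in how the slope parameter k is scaled.

```python
import math


def T1(x: float, k: float) -> float:
    """Logistic fuzzy threshold operator, eq. (3): 1 / (1 + exp(-k x))."""
    return 1.0 / (1.0 + math.exp(-k * x))


def T2(x: float, k: float) -> float:
    """Hyperbolic-tangent form, eq. (4): 0.5 + 0.5 * tanh(k x)."""
    return 0.5 + 0.5 * math.tanh(k * x)


if __name__ == "__main__":
    for k in (0.1, 0.7, 5.0, 50.0):
        # As k grows, both operators approach the unit step away from x = 0.
        print(k, round(T1(0.5, k), 6), round(T2(0.5, k), 6))
```

Note that `math.exp` overflows for arguments above roughly 709, so very large values of |k x| would need a guarded implementation.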

The binary threshold gate nets used here belong to a rather special class of networks (Lábos, 1980-1987). These are capable of generating transient-free behavior. In a more special, moreover rarely occurring (i.e. non-generic) case the networks may generate long or even maximal cycle lengths. This means that in an n-neuron net $L = 2^n$, the number of binary vectors in the state space. The nets were synthesized on the basis of a Theorem (Lábos, 1984, 1987) and were searched with computerized selection. Examples are presented in the quoted works from n = 1 to n = 9 dimensions (the number of neurons in a network). These finite nets in the actual autonomous case are non-chaotic. We will see, however, that the behavior of these nets may become suspiciously chaotic as soon as FTO-s like $T_1$ and $T_2$ are introduced. In Fig. 1-3 the so-called code-trajectory of such networks is presented as a reference. This is the diagram built of the consecutive states as decimally coded numbers based on the separate vectorial states of the net (e.g. code(011011) = 27 with n = 6). The diagram consists of the lines (x,x)-(x,y) and (x,y)-(y,y), where x and y are successive state codes. This is a simple method of representation, similar to the Poincaré and Lamerey diagrams used in dynamics. In "chaotic" cases only the next-state plot of the (x,y) pairs of states is used, by coordinates.
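The decimal coding and the staircase construction of the code-trajectory can be sketched as follows. The most-significant-bit-first reading is an assumption, chosen because it reproduces the example code(011011) = 27; the function names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def state_code(bits: Sequence[int]) -> int:
    """Decimal code of a binary state vector, first coordinate most significant."""
    code = 0
    for b in bits:
        code = 2 * code + int(b)
    return code


def code_trajectory(states: Sequence[Sequence[int]]) -> List[Tuple[Point, Point]]:
    """Line segments (x,x)-(x,y) and (x,y)-(y,y) for successive state codes x, y."""
    codes = [state_code(s) for s in states]
    segments = []
    for x, y in zip(codes, codes[1:]):
        segments.append(((x, x), (x, y)))  # vertical move to the next-state value
        segments.append(((x, y), (y, y)))  # horizontal move back to the diagonal
    return segments


print(state_code([0, 1, 1, 0, 1, 1]))  # 27
```

The list of segments can be handed to any plotting tool; drawing them in order gives the staircase picture that the text compares with Lamerey diagrams.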

3. BASIC OBSERVATIONS. (Figure 1-6). 

The observations refer to the FTO cases, since the binary case is more explored and plays here the role of reference for the new behavior. The comparison of the two situations is here the essential methodical and conceptual procedure. The computer simulations, of which examples are given, show that the "exponentially long" ($L = 2^{cn}$) or maximal ($L = 2^n$) cycle lengths of state flows change radically if FTO-s are applied instead of the unit step function. If the parameter k in functions (3) or (4) is suitable, an originally long cycle of a coordinate-flow may, after transient states, become a fixed point, a pair of fixed states, four clusters or four points. The four clusters occur at higher values of k and correspond to the (0,0), (0,1), (1,0), (1,1) quadruple of pairs of successive coordinates of state vectors. Their transitions (i.e. a next-coordinate plot) in the binary, non-fuzzy case are not so interesting, since only a few (8) transitions may occur: (0,0)-(0,0), (0,0)-(0,1), (0,1)-(1,0), (0,1)-(1,1), (1,0)-(0,0), (1,0)-(0,1), (1,1)-(1,0), (1,1)-(1,1). In the fuzzy case (see all Figures) the state space becomes a continuum set, and not only the coded vectorial flows show an interesting picture but also the coordinate-flows. For this reason, and also because the full state is hard to represent, these phase-diagrams by components were displayed.

Bifurcation diagrams with the steepness k as control parameter were also produced. The usual routes to chaos-like dynamics can be demonstrated. Such systems include $n^2 + n + 1$ numerical parameters because of the matrix, threshold vector and S-shaped operator, where n is the dimension. E.g. at n = 9, 91 different bifurcation diagrams are possible.
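A bifurcation diagram of the kind described here can be approximated by sweeping k, discarding a transient and recording one component of the state. The sketch below is a minimal illustration, assuming the logistic operator $T_1$ and a hypothetical two-unit net; the transient length, the number of retained points and all names are illustrative choices, not the authors' settings.

```python
import math
from typing import List, Sequence, Tuple


def parameter_count(n: int) -> int:
    """n*n matrix entries + n thresholds + 1 slope parameter."""
    return n * n + n + 1


def bifurcation_points(M: Sequence[Sequence[float]], theta: Sequence[float],
                       s0: Sequence[float], ks: Sequence[float],
                       component: int = 0, transient: int = 200,
                       keep: int = 50) -> List[Tuple[float, float]]:
    """For every k, iterate s -> T1(sM - theta) and keep post-transient values."""
    n = len(s0)
    points = []
    for k in ks:
        s = list(s0)
        for t in range(transient + keep):
            s = [1.0 / (1.0 + math.exp(-k * (sum(s[i] * M[i][j] for i in range(n))
                                             - theta[j])))
                 for j in range(n)]
            if t >= transient:
                points.append((k, s[component]))
    return points


# Hypothetical toy net (not from the paper); k swept from -0.5 to +0.5.
M = [[0.0, 1.0], [-1.0, 0.0]]
theta = [-0.5, 0.5]
ks = [i / 10 for i in range(-5, 6)]
pts = bifurcation_points(M, theta, [0.0, 0.0], ks)
print(parameter_count(9), len(pts))
```

Plotting the recorded values against k gives one diagram per varied parameter and displayed component, which is why the number of possible diagrams grows with the parameter count.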

The study of non-monotonic operators in place of the distribution functions or threshold operators is neglected here, since the norm of the matrix and threshold, or the value of k, can be changed so that the working domain remains inside a bounded set.

The insight which can be gained from the various diagrams is a possibility of 
categorization of the diverse dynamical behaviors. 

4. SHORTER CYCLE LENGTHS WITH FUZZY OPERATORS. 

The most radical shortening of the cycles is the case when a maximal finite cycle 
becomes a fixed point. This occurs at very small absolute values of k. "Very small" 
seems to be different at different values of dimensions of the state vectors. Usually at 
higher dimensions smaller k-s still are capable of displaying complicated dynamical 
behavior. 
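The collapse of a cycle into a fixed point at small |k| can be reproduced on a toy example. The following sketch assumes the logistic operator $T_1$ and a hypothetical two-unit net whose binary version has the maximal cycle length 4; the values k = 0.1 and k = 50, the step count and the tolerance are illustrative choices.

```python
import math
from typing import List, Sequence


def fuzzy_step(s: Sequence[float], M: Sequence[Sequence[float]],
               theta: Sequence[float], k: float) -> List[float]:
    """One iteration s -> T1(sM - theta) with the logistic operator T1."""
    n = len(s)
    return [1.0 / (1.0 + math.exp(-k * (sum(s[i] * M[i][j] for i in range(n))
                                        - theta[j])))
            for j in range(n)]


def settles_to_fixed_point(s0: Sequence[float], M: Sequence[Sequence[float]],
                           theta: Sequence[float], k: float,
                           steps: int = 500, tol: float = 1e-9) -> bool:
    """True if, after `steps` iterations, one further step changes nothing."""
    s = list(s0)
    for _ in range(steps):
        s = fuzzy_step(s, M, theta, k)
    nxt = fuzzy_step(s, M, theta, k)
    return max(abs(a - b) for a, b in zip(s, nxt)) < tol


M = [[0.0, 1.0], [-1.0, 0.0]]   # hypothetical toy net, binary cycle length 4
theta = [-0.5, 0.5]

print(settles_to_fixed_point([0.0, 0.0], M, theta, k=0.1))   # shallow operator
print(settles_to_fixed_point([0.0, 0.0], M, theta, k=50.0))  # nearly a unit step
```

With the shallow operator the map is a contraction and the state settles; with the steep operator the orbit stays close to the four corner states of the binary cycle.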


5. OCCASIONAL LOSS OF PERIODICITY BY APPLICATION OF FUZZY OPERATORS.

As we cannot prove analytically that aperiodicity in fact occurs in such complicated dynamical systems as are presented here, the statement of the emergence of chaos is based on computer experience. This is a frequent situation in [text lost at the page break in the source]. It is an undecided question whether, in the cases which became chaotic (aperiodic), [illegible] stable periodic attractors depending on the initial value [illegible]. It is also a question what species of chaos occur (see e.g. in Holden, 1986; Kohda and Aihara, 1990, etc.).

6. OTHER EFFECTS. 



[Beginning of this paragraph illegible in the source.] It is often observed in such cases that the behavior is vibrating. Regular jumps occur between two cycles, and therefore two cycles may have a unified basin of attraction.

7. ADVANTAGE OF FUZZY OPERATORS OR NOT? 

The formal neuronal networks with the presented thresholds are suitable for coding or for economical (small network) control of a large number of effector organs. The advantage of fuzzy decisions, for which generalized truth functions can be used (see in Zadeh-Fu-Tanaka-Shimura, 1975), compared with the binary or many-valued logics is not yet completely explored. No doubt the finite valued logical decisions can be played back to the binary case, at least syntactically. The numerous values are justified if more than two meanings can [illegible].

[Illegible] continuous operators, the number of possible decisive cases becomes infinite or, moreover, a continuum set. In [illegible] a handicap or can be tolerated. Tolerance may be introduced by digital ([illegible]-shaped) decision operators dividing the domains again into sharply distinguishable [illegible].

The concept of fuzziness seems to be more applicable in [illegible] those which were touched by this work (see in Zadeh, 1975; Bezdek and Sarkar, 1992).

No doubt the present form of "soft threshold logic" as a continuous generalization of binary threshold logic, and its relationship to chaos, appears to be a most promising theoretical subject. At the same time it might occur that chaos caused by the introduction of fuzziness or non-degenerated measuring operators may restrict the range of possible applications.

8. DISCUSSION 

The main conclusion is that fuzziness - which [illegible in source] neurocomputers - may introduce chaos (or even confusion, Mendes France, 1989) into the behavior of a formal neuronal net or neurocomputer. After synthesis a tuning of the operator is required to avoid chaos and its implications. However, the transition between the sharply decisive dynamics of finite binary systems and chaos can be controlled by the slope parameter k. Chaos may occur in model networks (Lábos, 1986; Derrida and Meir, 1988), and the connection between the two paradigms merits attention. The message for neurocomputer science is that it is not sufficient to synthesize a net, let us say by a learning process; a tuning of the threshold operator is still necessary. Real neural systems display aperiodic but stable behavior. The presence of chaos in real nervous systems seems to be plausible (e.g. Freeman, 1987). But stability and reliability require deeper explanations.

ACKNOWLEDGEMENTS: This work was supported by OTKA 2621/1991 and ETT- 
T032/1990 Grants. 

9. REFERENCES. 

Bezdek, J. C. and Sarkar, K.P. (1992): Fuzzy Models for Pattern Recognition. IEEE Press, Piscataway.

Carpenter, G. A. (1989): Neural Network Models for Pattern Recognition and Associative Memory. Neural Networks 2: 243-257.

Derrida, B. and Meir, R. (1988): Chaotic behavior of layered neural network. Phys. Rev. A 38(6): 3116-3119.

Freeman, R. (1987): Dynamical Models of Neural Function that Generate Chaos. IBRO-1987-Budapest, Neuroscience Suppl. to 22: 297S.

Grossberg, S. (1988): Neural networks and natural intelligence. Cambridge, MA: MIT Press.

Halmos, P.R. (1974): Measure Theory. Springer, Berlin.

Holden, A. (Ed.) (1986): Chaos. Manchester UP, Manchester.

Kohda, T. and Aihara, K. (1990): Chaos in Discrete Systems and Diagnosis of Experimental Chaos. Trans. IEICE E 73: 772-783.

Lábos, E. (1975): On the Significance of Wiring and Language in the Behavior and Information Processing of Neuronal Networks. In MTA Biol. Oszt. Közl. 18: 325-340.

Lábos, E. (1980): Optimal Design of Neuronal Networks. In Neural Communication and Control. (Eds.: Székely, Gy. & al.) Adv. Physiol. Sci. 30: 127-153.

Lábos, E. (1984): Periodic and non-periodic motions in different classes of formal neuronal networks and chaotic spike generators. Cybernetics and System Research 2 (Ed.: R. Trappl), Elsevier, Amsterdam, pp. 237-243.

Lábos, E. (1987a): Chaos and Neural Networks. In: Chaos in Biological Systems (Eds.: Degn, H., Holden, A. V. and Olsen, L. F.), NATO ASI Series A. Plenum Press, New York, pp. 195-206.

Lábos, E. (1987b): The most complicated networks of formal neurons. In Proceedings of IEEE 1st International Conference on Neural Networks, San Diego 1987 (Eds.: M. Caudill and Ch. Butler), Vol. III, Network Architecture, pp. 301-308.

Lábos, E. (1988): Asynchronous Versus Synchronous Computation in the Nervous System and Their Models. J. Molecular Electronics 67-77.

McCulloch, W. S. and Pitts, W. (1943): A logical calculus of the ideas immanent in the nervous system. Bull. Math. Biophys. 9: 127-147.

Mendes France, M. (1989): Chaos implies confusion. In London Math. Soc. Lecture Note Series. Cambridge U.P., Cambridge.

Metropolis, N., Rosenbluth, A. W., Teller, M. N., and Teller, E. (1953): Equations of state calculations by fast computing machines. J. Chem. Physics, 21: 1087-1091.

Simpson, P.K. (1990): Artificial Nervous Systems. Pergamon, London.

Zadeh, L.A., Fu, K.-S., Tanaka, K. and Shimura, M. (1975): Fuzzy Sets and Their Application to Cognitive and Decision Processes. Academic Press, New York.







MATRIX: 

1 1 1 - 1-5 l'l 
-511 1-1 l " 1 
-111 1 - 1 - 5-1 
- 1-5 1 l " 1-1 ' 1 
- 1-1 1 I -*' 1 5 

- 1 - 1-5 1 - 1-1 1 
- 1 - 1 - 1 - 5 - 1-1 1 

THRESHOLD: 

- 5 - 3 - 1 - 1" 6 " 4 1 






[FIGURE 1 - caption mostly illegible in the source; the legible fragments read: "... given ... operator T1. The value of slope factor k ..."]




FIGURE 2 - Network of eight units. Next state plots. (1) Upper left: The binary reference code trajectory of the L = 256 length maximal cycle. (2) Upper right and later: Fuzzy operator T1 is applied. The matrix and threshold are inserted into the upper right frame, k = 0.865; component 1; (3) k = 0.72; component 4; (4) k = 0.8; component 1.




FIGURE 3 - Nine units. Next state plots. - For the non-fuzzy threshold operator: L = 
512 (maximal). The matrix is in Figure 4. For fuzzy cases the components are 
successively as follows: 5th, 3rd, 4th. 





579 2 43168 
... MATRIX: 

1-1 1-1 1-1 7 1-1 
1 - 1-1 7 1-1 1 1-1 
1 - 1*1 11711-1 
1 - 1-1 1-7 1 1 1-1 
7 - 1-1 1-1 1 1 1-1 
1 - 1-1 1-1 1 1 - 7-1 
1 7-1 1-1 1 1 - 1-1 
1 1-1 1-1 1 1-1 7 
1171-11 1-1 1 
THRESHOLD: 
3106-55 7-3 0 


FIGURE 4 - FTO-next-state-plots. Same as in Fig 3, but the flows of the 6th, 7th, 8th and 9th components are displayed. In all cases the FTO is T1 and k = 0.7.






FIGURE 5 - Componentwise temporal diagrams: (A) n = 9; B-matrix, 579243168; 40; 0; k = 0.7. (B) n = 13; B-matrix, 123456789ABCD; 0; 0; k = 0.5. (C) n = 13; B-matrix, 123456789ABCD; 7; 0; k = 0.5. (D) n = 13; B-matrix, 123456789ABCD; 5207); k = 0.5. See also Fig 6.


FIGURE 6 - Bifurcation diagrams. Control parameter is k. (A): n = 6; c = 2; Matrix A; cp = 351264; ng = 4; is = 0; (B): n = 8; c = 4; matrix A; cp = 42856137; ng = 6; is = 0. Non-fuzzy cycle length is maximal. Values on the Y-axis are between 0 and 1; the X-axis is k and goes from -0.5 to +0.5. For more detailed matrix specifications in Fig 5 and 6 see the quoted publications of the first author.











N93-29577 


NEURAL NETWORKS: A SIMULATION TECHNIQUE UNDER UNCERTAINTY CONDITIONS

M. Luisa Nicosia McAllister
Mathematics Department, Moravian College,
Bethlehem, PA 18018
Tel 2158653187

Abstract 

THIS PAPER PROPOSES A NEW DEFINITION OF FUZZY GRAPHS AND SHOWS HOW TRANSMISSION THROUGH A GRAPH WITH LINGUISTIC EXPRESSIONS AS LABELS PROVIDES AN EASY COMPUTATIONAL TOOL. THESE LABELS ARE REPRESENTED BY MODIFIED KAUFFMANN FUZZY NUMBERS.


§1 Introduction

Ever since F. Harary introduced the concept of implication digraph in his 1965 text, much of the theory developed has been of interest to applications involving transmission. In this era of knowledge engineering, artificial intelligence, and neural networks, the interest in graph theory has grown because it provides a source for problem solving techniques; see [15,18,21]. How do we include uncertainty in the representation and evaluation of transmission through a network? To answer this question, it is necessary to review some basic terms associated with well-known techniques used in the evaluation of the flow through a special type of graph. We want to emphasize that, using [13,14], the mathematics needed to incorporate uncertainty leads to easily applicable techniques which necessitate the discussion of fuzzy graphs. Thus we propose here to combine the principles of fuzzy set theory with those of graph theory. This combination may be applied to the problem of the evaluation of a transmission through a neural network. What imprecision do we have here that is not handled by probabilistic means? Because of the vagueness and uncertainty which occur in the simulation of realistically complex situations, we have to resort to techniques which can handle the vagueness of linguistic assessments. The modeling of neural networks is thus proposed by assuming that the concept of vertices being members of the vertex set and of arcs being members of the arc set is not crystal clear. It is susceptible to imprecision because of uncertain numerical evaluations, see [13], or because of linguistic assessments, see [14]. Thus, using an approach according to fuzzy set theory in [12], it is possible to generate a simulation representing the imprecision, which is not of a probabilistic nature.


§2 Basic terminology





The term graph in this context will be used to mean directed graph or digraph; see [1].

Definition 1: A graph is called stochastic if the following information is associated to each arc:

(1) The probability that this arc is selected;

(2) A random variable, such as time, is associated to each arc.

Definition 2: A stochastic graph is called a flowgraph if there exists a sink and a source, and if two boolean operators are associated to each vertex. The two operators are usually the AND or the ORELSE.

Why do we need these operators? These operators control the flow between vertices. The evaluation of the flow is one of the problems of interest when flowgraphs are implemented in a simulation procedure. Generally, in this type of application, each arc entering a vertex represents an activity to be completed; see [2c]. When the AND operator is associated with that vertex, it means that no new activity can be pursued until all previous activities are completed. When the boolean operator is the ORELSE, then only one of the entering activities must be completed before any new activity can be pursued. An additional requirement for a digraph to be called a flowgraph is that there ought to exist two special vertices. Recall that in graphs where more than one arc enters or leaves a vertex, we define for each vertex its indegree, the number of arcs entering that vertex, and its outdegree, the number of arcs leaving that vertex.
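The effect of the two vertex operators on timing can be illustrated with a small sketch. It assumes that each entering activity has a known completion time and interprets the description above literally: an AND vertex waits for the latest entering activity, an ORELSE vertex for the earliest. The function name and the sample times are hypothetical.

```python
from typing import Sequence


def release_time(completion_times: Sequence[float], operator: str) -> float:
    """Earliest time at which a new activity may leave the vertex."""
    if not completion_times:
        raise ValueError("at least one entering activity is required")
    if operator == "AND":
        # All entering activities must be completed.
        return max(completion_times)
    if operator == "ORELSE":
        # A single completed entering activity is sufficient.
        return min(completion_times)
    raise ValueError(f"unknown operator: {operator!r}")


entering = [3.0, 5.5, 4.2]  # hypothetical completion times of entering arcs
print(release_time(entering, "AND"), release_time(entering, "ORELSE"))
```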

Definition 3: If there exists a vertex whose outdegree is equal to zero, then it is called a sink. A vertex whose indegree equals zero is called a source.
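Indegree, outdegree and the two special vertices of Definition 3 can be computed directly from an arc list. The sketch below is illustrative; the vertex names and the example digraph are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Arc = Tuple[str, str]


def degrees(vertices: Iterable[str],
            arcs: Iterable[Arc]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (indegree, outdegree) dictionaries for a digraph."""
    vertex_list = list(vertices)
    indeg = {v: 0 for v in vertex_list}
    outdeg = {v: 0 for v in vertex_list}
    for tail, head in arcs:
        outdeg[tail] += 1
        indeg[head] += 1
    return indeg, outdeg


def sources_and_sinks(vertices: Iterable[str],
                      arcs: Iterable[Arc]) -> Tuple[List[str], List[str]]:
    """Sources have indegree 0; sinks have outdegree 0."""
    vertex_list = list(vertices)
    indeg, outdeg = degrees(vertex_list, list(arcs))
    sources = [v for v in vertex_list if indeg[v] == 0]
    sinks = [v for v in vertex_list if outdeg[v] == 0]
    return sources, sinks


V = ["s", "a", "b", "t"]
A = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
print(sources_and_sinks(V, A))  # (['s'], ['t'])
```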

A flowgraph can concisely be defined as a stochastic digraph with two special vertices: a sink and a source. Flowgraphs have been successfully used to model the execution of activities as depicted by the digraph. It is then a natural extension to investigate their use when the network under investigation is a neural network. Surprisingly, no research efforts in such direction are known to this author. However, this is not the focus of this paper. As stated in the abstract, we propose here the use of fuzzy graphs as a technique to experiment with. The transmittance through a flowgraph was considered and solved by several authors, primarily using Mason's rule, which is the best known; see [2]. A brief review of Mason's rule is given in the next section with some details. For additional details on algorithms and examples see [2,2a,2b,3].

§3. Path Transmittance in Flowgraphs: Mason's Rule

Let R be an n x n matrix where the value of each entry $r_{ij}$ depends on a random variable. These values are obtained from the characteristic moment-generating function for the distribution of the random variable. Let P be an n x n probability matrix where $p_{ij}$ equals the probability that the arc (i,j) is selected. In brief, the computation of the total transmittance along each path requires the search of all paths in the flowgraph from source to sink to be completed first. We construct a new matrix, called the transmittance matrix and denoted by $T = (t_{ij})$, where each entry is the product of a random variable with the corresponding probability associated to that arc; namely, $t_{ij}$ is the product of $r_{ij}$ and $p_{ij}$. If we assume that the search of all paths from source to sink has already been made, so that we know all the paths and that there are q paths from source to sink, then the total transmittance is computed according to Mason's Rule

$$\frac{\sum_{k=1}^{q} W_k \,\bigl(1 - \det(T_k)\bigr)}{1 - \det(T)},$$

where $T_k$ is a submatrix of T which is obtained from T by removing from it the row and the column that correspond to each vertex in the k-th path. The quantity $W_k$ is the path transmittance of the k-th path. Let the fuzziness of each set be measured according to [4], a method specifically designed for graphs.
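The ingredients of this computation, namely the entries $t_{ij} = r_{ij} p_{ij}$, the list of source-to-sink paths and the path transmittances $W_k$, can be sketched as follows. Two assumptions are made that the text does not state explicitly: $W_k$ is taken as the product of the arc transmittances along the k-th path (the usual path gain in Mason's rule), and the numerical values of r and p are hypothetical placeholders rather than values derived from moment-generating functions.

```python
from typing import Dict, List, Tuple

Arc = Tuple[str, str]


def transmittance(r: Dict[Arc, float], p: Dict[Arc, float]) -> Dict[Arc, float]:
    """Entry-wise product t_ij = r_ij * p_ij over the arcs of the flowgraph."""
    return {arc: r[arc] * p[arc] for arc in r}


def all_paths(t: Dict[Arc, float], source: str, sink: str) -> List[List[str]]:
    """Depth-first enumeration of all simple paths from source to sink."""
    paths: List[List[str]] = []

    def extend(path: List[str]) -> None:
        last = path[-1]
        if last == sink:
            paths.append(path)
            return
        for tail, head in t:
            if tail == last and head not in path:
                extend(path + [head])

    extend([source])
    return paths


def path_transmittance(t: Dict[Arc, float], path: List[str]) -> float:
    """W_k, assumed here to be the product of t_ij along the path."""
    w = 1.0
    for arc in zip(path, path[1:]):
        w *= t[arc]
    return w


# Hypothetical values for a four-vertex flowgraph.
r = {("s", "a"): 0.9, ("s", "b"): 0.8, ("a", "t"): 0.7, ("b", "t"): 0.6}
p = {("s", "a"): 0.5, ("s", "b"): 0.5, ("a", "t"): 1.0, ("b", "t"): 1.0}
t = transmittance(r, p)
for path in all_paths(t, "s", "t"):
    print(path, path_transmittance(t, path))
```

The determinant terms of Mason's Rule are deliberately left out of this sketch; only the path search and the path transmittances are shown.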

§4. Fuzzy Graphs

The first to consider fuzzy graphs were A. Rosenfeld in [5,6,17], and R.T. Yeh with S.Y. Bang, whose work is also included in reference [17]. Most authors, including this author in [2b,2d], defined a fuzzy graph as simply a graph whose adjacency matrix is replaced by the membership matrix $M = (m_{ij})$ under the convention that if the entry $m_{ij} \ne 0$, then the arc (i,j) has $m_{ij}$ as the evaluation of the membership. There are two types of evaluations: the numerical, where we generally have $0 < m_{ij} \le 1$, and that based on a functional interpretation, namely, $m_{ij} \ne 0$ is a mapping, or more specifically a fuzzy number.

Definition 4: A fuzzy number is a convex, normalized fuzzy set. At the conclusion of this work the suggested fuzzy number will be denoted $Z_n$. We define a fuzzy graph as follows: Let V be the support set of the vertices. Let

$$m_V : V \to [0,1].$$

Then the fuzzy vertex set is denoted by

$$V^f = (V, m_V).$$

Similarly, let the fuzzy arc set be denoted by $A^f = (A, m_A)$, where $m_A$ maps the support set A, which is the crisp subset of the cartesian product of V with itself, into the interval [0,1]. A fuzzy graph is then the pair of fuzzy sets, written as $G^f = (V^f, A^f)$. In all of the above and in the sequel, the superscript f is used to remind us that the set is assumed to be fuzzy. In this way, if $m_{ij} = 0$ then it means that there is no link between the vertex i and the vertex j. If the connection between the two links is not crystal clear, then $m_{ij}$ is either equal to some value between 0 and 1 or it is a fuzzy number.
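For the numerical type of evaluation, the pair $(V^f, A^f)$ can be held in a small data structure. The sketch below is one possible representation, not the author's; the class name, field names and example values are hypothetical, and only numerical memberships in [0,1] are covered (fuzzy-number labels are not).

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Arc = Tuple[str, str]


@dataclass
class FuzzyGraph:
    """Fuzzy vertex set (V, m_V) and fuzzy arc set (A, m_A), numerical case."""
    vertex_membership: Dict[str, float]
    arc_membership: Dict[Arc, float]

    def __post_init__(self) -> None:
        for v, m in self.vertex_membership.items():
            if not 0.0 <= m <= 1.0:
                raise ValueError(f"membership of vertex {v!r} outside [0, 1]")
        for (i, j), m in self.arc_membership.items():
            if i not in self.vertex_membership or j not in self.vertex_membership:
                raise ValueError(f"arc {(i, j)!r} uses an unknown vertex")
            if not 0.0 <= m <= 1.0:
                raise ValueError(f"membership of arc {(i, j)!r} outside [0, 1]")

    def membership(self, i: str, j: str) -> float:
        """m_ij; an arc that is not listed has membership 0."""
        return self.arc_membership.get((i, j), 0.0)

    def has_link(self, i: str, j: str) -> bool:
        """m_ij = 0 means there is no link between vertex i and vertex j."""
        return self.membership(i, j) != 0.0


g = FuzzyGraph({"u": 1.0, "v": 0.8}, {("u", "v"): 0.6})
print(g.has_link("u", "v"), g.has_link("v", "u"))  # True False
```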

A fuzzy graph was at first defined as a labelled graph, often without ever clarifying the necessary algebra. [Illegible in source] algebra to use here is the MIN or the MAX operators, which are commutative, associative, and [illegible]. For a full discussion on path algebras with [illegible] see [19; pp. 85-88]. Note that the set of labels can consist of numerical values or of functional interpretations of linguistic expressions. In the former case, i.e. for numerical evaluations, the use of the MAX or MIN operators is straightforward. If functional interpretations of linguistic expressions are used as labels, then fuzzy numbers are used. However, in this case, the MAX and MIN operators are not simple to use unless we devise a method for ranking fuzzy numbers.
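For numerical labels the MIN and MAX operators combine in a familiar way. The sketch below uses the common max-min convention, which is an assumption here since the text only names the operators: the strength of a path is the MIN of its arc labels, and the strength of the connection between two vertices is the MAX over the available paths. Labels and paths are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Arc = Tuple[str, str]


def path_strength(labels: Dict[Arc, float], path: Sequence[str]) -> float:
    """MIN of the numerical labels along one path (weakest link)."""
    return min(labels[arc] for arc in zip(path, path[1:]))


def connection_strength(labels: Dict[Arc, float],
                        paths: List[Sequence[str]]) -> float:
    """MAX over the given paths of their MIN strengths."""
    return max(path_strength(labels, p) for p in paths)


labels = {("s", "a"): 0.9, ("a", "t"): 0.4, ("s", "b"): 0.6, ("b", "t"): 0.7}
paths = [["s", "a", "t"], ["s", "b", "t"]]
print(connection_strength(labels, paths))  # 0.6
```

With fuzzy-number labels the same code would need a comparison rule in place of Python's built-in ordering of floats, which is exactly the ranking problem raised in the text.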

A totally different definition of fuzzy graph is proposed here after a brief review of the homology of graphs.

§5 Background on the Homology of Graphs

Given a classical graph G = (V, A), we recognize two vector spaces. The first is the vertex vector space and the second is the edges vector space.

Definition 5: A fuzzy graph is a pair of vector spaces $G^f = (V^f, A^f)$. Is this definition conflicting with the previous one? No, it simply identifies $V^f$ and $A^f$ for what they usually are: two vector spaces with the vector space operations defined by

$$(m + h)(v) = m(v) + h(v) \quad\text{and}\quad (am)(v) = a\,m(v)$$

for every v in V and for any real or complex number a. The dimension of the vertex vector space equals the cardinality of V. Note that in such a case, the mapping $m_V$ maps the set of vertices V into the set C of complex or real numbers, and the mapping $m_A$ maps the set of links A into C. Denote these vector spaces respectively by $V^f$ and $A^f$. Thus a fuzzy graph is a pair of vector spaces, $G^f = (V^f, A^f)$. Having stated that the use of the MIN and MAX operators is difficult when functional expressions are associated with each arc, it seems necessary to investigate how it is possible to solve the difficulty. Some authors have worked successfully on the ranking of fuzzy numbers; see [4] or the many papers by S. [name illegible in source]. [Illegible] we propose the adoption of a somewhat easier solution because of the special type of fuzzy numbers we adopt. First we define the fuzzy numbers which will be used. They are called the Kauffmann integers.

§7 Reasons for the term: Kauffmann integers

First, they are called Kauffmann as they were introduced in [20]. Secondly, they are called integers because Kauffmann shows in his book that they form a Peano system. A brief review, for clarity of exposition, is given in §8. Note that this derivation is the same as the one given in [20]. The only new content given here consists of the propositions below and the fact that, since the convexity requirement is satisfied but not the normality requirement, which is included in the definition of a fuzzy number, the functional representation is changed slightly.

§8 The Kauffmann integers

In this section a brief review is given, mostly following Kauffmann's derivation. First, a set denoted $(K^f)_a$ is constructed, for which we then derive some useful properties. They are called integers because Kauffmann also shows, see [20], that by defining suitable operations $(K^f)_a$ is essentially like the set of whole numbers N.

§9 The construction of the set $(K^f)_a$

Let the elements of a set $(K^f)_a$ be denoted by

$$(K^f)_a = \{K_1, K_2, K_3, \ldots, K_n, \ldots\}$$

where the subscript a is used because it is a parameter; any positive real number may be used. In this section, the elements $K_n$ will be derived by defining a unary operation called the 'successor' operation. First we will define $K_1$ and then we will obtain $K_n$ recursively from $K_1$ and $K_{n-1}$ for $n \ge 2$. Then an explicit expression will be obtained for $K_n$.

(A) Construction [remainder of heading illegible in source]: Let the first element $K_1$ be defined according to the following: $f_1(x) = a e^{-ax}$, $a > 0$, and let

$$K_1 = \{(x, f_1(x))\}. \qquad (1)$$

Denote the next element by

$$K_2 = \{(x, f_2(x)) : x \in [0, \infty)\},$$

where the function $f_2(x)$ is computed according to the following procedure:

$$f_2(x) = \int_0^x f_1(t)\, f_1(x-t)\,dt = \int_0^x a e^{-at}\, a e^{-a(x-t)}\,dt = a^2 x e^{-ax} \qquad (2)$$

for $x \in [0, \infty)$. Any element of the set $(K^f)_a$ is computed recursively according to the following:

$$f_n(x) = \int_0^x f_{n-1}(t)\, f_1(x-t)\,dt \qquad (3)$$

with

$$K_n = \{(x, f_n(x)) : x \in [0, \infty)\}$$

for n > 1. Note that (3) essentially defines the desired recursion operation to derive the elements of $(K^f)_a$.


(B) Construction of an explicit expression: Both $K_1$ and $K_2$
have an explicit formula for $f_1(x)$ and $f_2(x)$. Can explicit
formulae be found for the other elements, $K_n = (x, f_n(x))$?

If so, we must be able to find an explicit expression for
$f_n(x)$. Then, after a graphical interpretation, a few other
facts will be established so that comparison operations can
be easily identified and fairly simple to code.

Proposition 1: For all n ≥ 2, we have

$f_n(x) = \frac{a^n x^{n-1} e^{-ax}}{(n-1)!}$

Proof: The expression of $f_n(x)$ holds when n = 1 and n = 2.
Assume that it holds for (n-1) so that we have

$f_{n-1}(x) = \frac{a^{n-1} x^{n-2} e^{-ax}}{(n-2)!}$

From the recursion definition it then follows that

$f_n(x) = \int_0^x f_{n-1}(t)\, f_1(x-t)\, dt = \frac{a^n e^{-ax}}{(n-2)!} \int_0^x t^{n-2}\, dt$.

Completing the evaluation of the integral we obtain
precisely the statement of the proposition.
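
The induction step of Proposition 1 can also be checked numerically. The following is only a sketch, not part of the original derivation: the function names, the trapezoidal rule, and the step count are assumptions chosen for illustration. It integrates recursion (3) with the closed form inserted for $f_{n-1}$ and compares the result with the closed form for $f_n$.

```python
import math


def f_closed(n, x, a):
    """Closed form of Proposition 1: a^n x^(n-1) e^(-a x) / (n-1)!."""
    return a**n * x ** (n - 1) * math.exp(-a * x) / math.factorial(n - 1)


def f_recursive(n, x, a, steps=2000):
    """One step of recursion (3): integrate f_(n-1)(t) * f_1(x - t) over [0, x].

    The integral is approximated with the composite trapezoidal rule;
    `steps` is an illustrative choice, not a value from the paper.
    """
    if n == 1:
        return a * math.exp(-a * x)
    h = x / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * f_closed(n - 1, t, a) * f_closed(1, x - t, a)
    return total * h


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, f_recursive(n, 2.0, 1.5), f_closed(n, 2.0, 1.5))
```

The two printed columns should agree to several decimal places for each n, which is what the induction argument predicts.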

The advantage of the explicit form is that we can find 
the sketches of the membership function and its extrema and 
so might devise a method for ranking these numbers in the 
easiest possible way. 


Proposition 2: For n > 1 each $f_n(x)$ has an absolute maximum
$M_n$, with

$M_n = \frac{a\,(n-1)^{n-1}}{(n-1)!\; e^{n-1}}$,

and this occurs at $x = \frac{n-1}{a}$.

Proof: Setting the derivative of $f_n(x)$ equal to zero gives
the equation n - 1 - ax = 0, which has the desired value.
Since the second derivative of $f_n(x)$ is negative for this
value of x, we have a maximum. It is an absolute maximum
because $\lim_{x \to \infty} f_n(x) = 0$.

Note that this maximum need not equal one.

Proposition 3: The sequence $\{M_n\}$, n > 1, is monotonically
decreasing.

Proof: We have $M_2 = a/e$ and $M_3 = 2a/e^2$. Therefore, $M_2 > M_3$. To
show that the statement of the proposition is true, we must
show that $M_k < M_{k-1}$ for all k > 3. Namely, we must show that

$\frac{(k-1)^{k-1} e^{-(k-1)}}{(k-1)!} < \frac{(k-2)^{k-2} e^{-(k-2)}}{(k-2)!}$. (4)

To show that it is monotonically decreasing we have to show
that the inequality $M_k > M_{k+1}$ holds for all k. Specifically
we must show that the inequality

$\frac{(k-1)^{k-1} e^{-(k-1)}}{(k-1)!} > \frac{k^{k} e^{-k}}{k!}$ (5)

is true by algebraically reducing it to a true statement.
After simplification and taking the natural logarithm of each
side, we have reduced the proof of the inequality (5) to the
verification of the inequality

$(k-1)\ln\frac{k-1}{k} + 1 > 0$ or $(k-1)\ln\frac{k-1}{k} > -1$. (6)

If (6) holds, (5) is true. Let $p(k) = (k-1)\ln\frac{k-1}{k}$. Since
$\ln(1+t) < t$ for every t > 0, taking t = 1/(k-1) gives
$(k-1)\ln\frac{k}{k-1} < 1$, that is, p(k) > -1 for every
integer k ≥ 2. Thus (6) is true.
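
Propositions 2 and 3 are easy to check numerically. The sketch below is illustrative only; the function names and the sample value of the parameter a are assumptions, not part of the original text. It evaluates the peak location (n-1)/a and the peak value $M_n$, and the assertions in the usage section confirm that the sequence decreases over a finite range of n.

```python
import math


def f(n, x, a):
    """Membership function of K_n in closed form (Proposition 1)."""
    return a**n * x ** (n - 1) * math.exp(-a * x) / math.factorial(n - 1)


def peak_location(n, a):
    """Location of the maximum from Proposition 2: x = (n - 1) / a."""
    return (n - 1) / a


def peak_value(n, a):
    """Maximum from Proposition 2: M_n = a (n-1)^(n-1) e^-(n-1) / (n-1)!."""
    m = n - 1
    return a * m**m * math.exp(-m) / math.factorial(m)


if __name__ == "__main__":
    a = 2.0  # illustrative parameter value
    for n in range(2, 8):
        print(n, peak_location(n, a), peak_value(n, a))
```

A finite check of this kind does not replace the proof, but it is a quick guard against transcription errors in the formulas.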

To visualize the elements of $(K^*)_a$, namely some of the
pairs $(x, f_n(x))$, it suffices to sketch the functions $f_n(x)$ in
the first quadrant since x, n, and a are all non-negative.
The sketches in the figure at the end of this section provide
a graphical interpretation of three elements: the fuzzy sets
$K_1$, $K_2$, and $K_3$; in fact, the regions on the plane over the
x-axis under the curves corresponding, respectively, to $f_1(x)$,
$f_2(x)$ and $f_3(x)$.

§10 The Modified Kauffmann Fuzzy Numbers $Z_n$

Note that there is no value of x for which $f_n$ achieves
the value 1 because there is no solution to $x = e^x$. Thus
these numbers $K_n$ are not normalized, and therefore are not
fuzzy numbers either. Since convexity holds it is easy to
verify that a minor modification of the membership function
satisfies the normality condition.





In fact, let $g_n(x) = 1 + f_n(x)$, so that if x = 0, then
$g_n(0) = 1$. Thus, the elements $Z_n = (x, g_n(x))$, x ≥ 0, are convex
normalized fuzzy sets, and are therefore fuzzy numbers. The
letter Z is used to remind us that Prof. L. A. Zadeh first
introduced the concept of a fuzzy number and its computational
application to fuzzy quantifiers in natural languages.
See a detailed exposition in the text [4].

Can we define an ordering for the fuzzy numbers $Z_n$? This
can be done, besides using their maximum values, also by other
methods which are based on the next definition.

Definition 6: The height of the fuzzy set $S_f$ on any interval
[a,b] of the real line is denoted by $h(S_f)$ and it is defined
by $h(S_f) = \max\{f(x) : x \in [a,b]\}$.

We can apply this definition to the modified Kauffmann
integers $Z_n$ because of the propositions above. We not only
know what their maximum value is but we also know where that
maximum is. Since the maximum need not equal one, the use of
the height is an alternative method which might be preferred
over the use of the maximum. We can order these integers
according to their height or according to the maximum $M_n$. We
know what this maximum $M_n$ is by the previous propositions. Thus
we have an easy computational method for their ranking.
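
As a rough illustration of ranking by height, the sketch below uses the fact that $f_n$ rises up to x = (n-1)/a and falls afterwards, so its maximum on an interval is attained at the peak location clamped to that interval. The function names and the interval endpoints `lo` and `hi` are assumptions made for this example. The ranking is computed on $f_n$; adding the same constant to every membership function, as in $g_n$, would not change the order.

```python
import math


def f(n, x, a):
    """Membership function of K_n in closed form (Proposition 1)."""
    return a**n * x ** (n - 1) * math.exp(-a * x) / math.factorial(n - 1)


def height(n, a, lo, hi):
    """Height of K_n on [lo, hi] in the sense of Definition 6.

    f_n is unimodal with its peak at (n - 1) / a, so the maximum over the
    interval is reached at the peak location clamped to [lo, hi].
    """
    x_star = min(max((n - 1) / a, lo), hi)
    return f(n, x_star, a)


def rank_by_height(ns, a, lo, hi):
    """Return the indices n sorted from the largest height to the smallest."""
    return sorted(ns, key=lambda n: height(n, a, lo, hi), reverse=True)


if __name__ == "__main__":
    print(rank_by_height([2, 3, 4], 1.0, 0.0, 1.5))
    print(rank_by_height([2, 3, 4], 1.0, 5.0, 10.0))
```

The two calls show that the order obtained from the height depends on the chosen interval, whereas the order by the global maximum $M_n$ does not.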

§11 Some Concluding Remarks

It is a well-known fact that graph theory lends itself 
to applications. Often, we find computational techniques that 
are proposed without paying much attention to the coding, 
complexity, or storage difficulties. How do we store and 
manipulate graphs with so much information? If the coding 
language is Pascal, then it is recommended that each vertex 
is represented by a structure which contains information of 
the type: indegree, outdegree, etc. A similar structure is 
used for the set of links. All structures are linked to one 
another via linked lists. 

A procedure called putnetwork outputs the graph and all 
information about it. The determination of paths is simpli- 
fied because of recursion. The recursion is guaranteed to 
stop because there is a finite number of links; as branches 
are chosen, an array is passed down the recursion. Can we 
make use of fuzzy graphs? In previous work, we found 
applications for similarity relations which are important in 
building practical programs for fuzzy inferencing. We focus
on what happens to the concept of similarity relations
between distinct sets. The idea of similarity is no longer
obvious. We find that k-partite graphs offer an alternative.
An example of an application is included in previous work.
Fuzzy bipartite graphs were a problem solving tool for
J. Dockery and L. McAllister [2c, 2d]. For example, in [2c],
the authors focused on what happens to the concept of
similarity relations between distinct sets. The idea of
similarity is no longer obvious. They find that fuzzy
k-partite graphs offer a pictorial and computational
alternative. An example of an application was included there.
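
The recursive determination of paths described above can be sketched as follows. This is not the authors' Pascal program: the dictionary-of-successors representation, the function name `all_paths`, and the use of a Python list in place of the array passed down the recursion are assumptions made for illustration. Termination follows from the same argument as in the text: the graph is finite and a vertex already on the current path is never revisited.

```python
def all_paths(links, start, goal, path=None):
    """Enumerate all simple directed paths from `start` to `goal`.

    `links` maps each vertex to the list of its successors. The list `path`
    plays the role of the array passed down the recursion.
    """
    if path is None:
        path = [start]
    if start == goal:
        return [list(path)]
    found = []
    for nxt in links.get(start, ()):
        if nxt in path:  # skip vertices already on the current path
            continue
        path.append(nxt)
        found.extend(all_paths(links, nxt, goal, path))
        path.pop()  # backtrack before trying the next branch
    return found


if __name__ == "__main__":
    links = {"a": ["b", "c"], "b": ["c", "d"], "c": ["d"], "d": ["a"]}
    for p in all_paths(links, "a", "d"):
        print(" -> ".join(p))
```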



FIGURE: Sketches of the membership functions $f_1(x)$, $f_2(x)$, and $f_3(x)$ (caption illegible in the source scan).






REFERENCES 

[1] Harary F., Norman R., Cartwright D. "Structural Models: An Introduction to the Theory
of Directed Graphs". Wiley, 1965.

[2]- Me Allister. M.Luisa Nicosia. Cebulka Dickens Kathv Dickens S & rviUin 
James H StocAas//c Networks*: An Evaluation Technique and an Application* Proc Conf on 

ri y Qf Piasfrurgh- voi . 13 part 2. 1932. pp.775-778 ' ’ — 

C S uU “ K- Ds - W Editorial Pncesi of a BBimonthh : a Case 
Study , Me Allister MLuisa Nicosia LJMAP Journal, vol. 8, no 4 pp 1-6 1987 

I2h]- McAllister M Luisa Nicosia "A Measure of the Quality of Connection in a graph in 
Analysis of Information (J.C. Bezdek. Ed.), vol. 1. ch 9. CRC.. 1984. S P 

M 'c lli c N ‘ • & ^ k *D\J "Si'ntfart(y Relations " Proceedings of the loss 
^orkshop SanFrancisco State University. San Francisco. CA.. Ovchinnikov 'id 

"" trsec,ton 0rapts ’ A P Hjc..voi. 

[4] Dubois D. and Prade H. Fuzzy Sets and Systems: Theory and Applications. Academic Press.

[5] Rosenfeld A. "Fuzzy Digital Topology". Info. and Control 40, pp. 76-87.
[6] Rosenfeld A. "The Fuzzy Geometry of Image Subsets". Univ. of Maryland,
Automation Research, Comp. Sc. TR-1299, July 1983.

[7] Shortliffe E., Buchanan B. "A Model of Inexact Reasoning in Medicine". Math. Biosci.,
1975, pp. 351-379.

[8] - S jngle K^ Gay-nor M.& Halpera E. "A ;; intelligent control strategy for computer 
consultation I EEE Trans. Pattern Analysis machine Intell. PAVTT-A 2 1984 do P9-136 P 

[10]- Weiss Kulikowski C. & Amarel . "A model based method for computet- tided decision 
making". Artificial intelligence . 11. 1978 pp us. 17? J F 

BedrosimS. pei/z/br/w^/off in afiizy set and the relation between Shannon 
' Pgc^Conf.onlnfo. Sa. and S ystems- . John Hopkins Unversitv vol 

1 /. iy&>. pp. 248-2^3. 

[12] Zadeh L.A. "Fuzzy Sets". Info. and Control, 8, 1965, pp. 338-353.

[13] Zadeh L.A. "The Role of Fuzzy Logic in the Management of Uncertainty in Expert Systems".
Int. Journ. Fuzzy Sets and Systems, 11, 1983, pp. 199-227.

[14] Zadeh L.A. "A Computational Theory of Dispositions". Proc. Int. Conf. on Computational
Linguistics, Stanford Univ., 1984, pp. 312-318.

[15] Tarjan R.E. "Data Structures and Network Algorithms". CBMS 44, SIAM, 1983.

n M 3: P05s ' b ‘ , ‘ s " c Pr ° duc "°" 

( la - ***• «• 

AUister M. N. & Dockery J.. Jung P. "Computing Similarity’ Proceedino^ nf th* r«-H 
IFSA Congress. Seattle. W 7 A. July 1989. ' ^ ~ rq 

[19] Carré B. Graphs and Networks. Clarendon Press, Oxford, 1979.

[20] Kauffmann A. Introduction to the Theory of Fuzzy Subsets. [...]

[21] Wilf H.S. Algorithms and Complexity. Prentice Hall, 1987.


N93-29578


Incomplete Fuzzy Data Processing System Using Artificial Neural Network 





Marek J. Patyra 

Department of Computer Engineering 
University of Minnesota 
10 University Drive 
Duluth, MN 55812-2496 


Abstract 

In this paper, the implementation of a fuzzy data processing system using an artificial neural
network (ANN) is discussed. The binary representation of fuzzy data is assumed, where the universe of
discourse is discretized into n equal intervals. The value of the membership function is represented by a
binary number. It is proposed that incomplete fuzzy data processing be performed in two stages. The
first stage performs the “retrieval” of incomplete fuzzy data, and the second stage performs the
desired operation on the retrieved data. The method of incomplete fuzzy data retrieval is proposed
based on the linear approximation of missing values of the membership function. The ANN
implementation of the proposed system is presented. The system was computationally verified and
showed a relatively small total error.

1 Introduction 

Fuzzy data processing systems that perform fuzzy operations can be implemented using standard
or specialized software, but the ultimate way is to implement them in hardware. In fuzzy data
processing systems, the major functions are performed by fuzzy processing elements like Min, Max,
Bounded or Absolute Difference, etc., which can be connected in different ways (for instance,
Min-Max-Min), depending on the desired structure. Building fuzzy data processing systems is attractive;
however, in practice (e.g., in control systems) many data coming into the system are incomplete (e.g.,
disturbed, noisy, or damaged). As a result, the output data generated by the system are wrong or
contain unacceptable errors that may cause a series of problems, especially in critical real-world
applications.

Signal processing using a fuzzy approach has become more attractive during the last few years,
as fuzzy sets and tools have been applied successfully to a variety of tasks. These tasks cover
different areas of application, from speech and image processing to various pattern classifications [3].
Although the early stages of fuzzy signal processing, mainly involving pattern recognition, have been
successfully developed, fuzzy methods for data processing (such as operations on various patterns)
are yet to be developed. In the previous paper [2], the ANN (Artificial Neural Network) realization of the
fuzzy operations addition, subtraction, multiplication, division, minimum, and maximum was studied.
The conclusion of [2] indicates that the best results (in terms of average
error) for fuzzy operations using ANN can be obtained when the operations are performed on
nondegenerated fuzzy data. In contrast, the results of fuzzy operations using ANN performed on
degenerated fuzzy data contain relatively high error. To overcome these disadvantages, the two-stage fuzzy
data processing system is proposed in the present paper. The first stage performs the incomplete fuzzy
data retrieval, while the second stage produces the results of a desired fuzzy operation.

The paper is organized in the following way. First the theoretical background for the retrieval of
incomplete fuzzy data is given. Then the ANN realization of the retrieval stage is presented. The
practical example, discussed in Section 2, shows the two-stage fuzzy data processing system
(preprocessor performing fuzzy data retrieval and processor performing one of the discussed fuzzy operations).
Finally, the simulation results for the system are presented, followed by the conclusions.

2 Fuzzy data retrieval 

The discussion of the fuzzy data retrieval begins with the definition of the fuzzy number [1].
Definition 1. A fuzzy number $X = \{x_i\}$ is defined over a normalized set A on the real line R such
that:

$\exists\, x_i \in R,\ \sup \mu_A(x_i) = 1$ (EQ 1)

Here $\mu_A(\cdot)$ denotes the membership function of $x_i$ in A, and $x_0$ is referred to as the mean value of A if
$\mu_A(x_0) = 1$.

Assuming the discrete representation of the fuzzy number, the ordinary fuzzy number¹ can be
described in the following way:

Definition 1A. Any fuzzy number X can be described in a finite domain $\{x_i\}$ by

$X = \sum_{i=1}^{n} \mu_A(x_i) / x_i$ (EQ 2)

where i = 1,..., n, n defines the number of equal intervals into which the fuzzy number X is
discretized, and $\sum$ denotes the union operation.

Based on Definition 1A, there must be a mean value for the ordinary fuzzy number, and
EQ 2 can be rewritten separately for the left and right intervals around $x_0$ as follows:

$X = \sum_{i=1}^{k} \mu_A(x_i)/x_i + \mu_A(x_0)/x_0 + \sum_{i=k+2}^{n} \mu_A(x_i)/x_i$ (EQ 3)

(assuming $x_{k+1} = x_0$). Such a representation is called the “discrete representation” of a fuzzy number.
The special case of discrete representation, digital representation, is commonly used in most current
applications of fuzzy technology. Hence, the universe of discourse is discretized into n intervals, each
of which will be called a “bit” by analogy with a digital representation of a number. However, any
value from the interval [0,1] can be assigned to each bit of a digital fuzzy number. Additionally, it is
assumed that the fuzzy numbers discussed in this paper are unimodal.

Definition 2. The degenerated fuzzy number Y' is the number with missing membership values $\mu_{y'}$
of one or more bit positions² (Fig. 1).

Figure 1. Digital representation of ordinary fuzzy number (a), and degenerated fuzzy number (b).

It is assumed in this paper that the discussed fuzzy numbers are unimodal. Let us now consider the
degenerated fuzzy number Y' and its retrieval system.


1. The term ordinary fuzzy number is used for fuzzy numbers in the sense of Definition 1.

2. The special case of the degenerated fuzzy number with missing membership function values for all bits is not discussed in this paper.






Definition 3. The fuzzy number retrieval system (see also [6]) is defined by a triplet (Y', Y'', ρ),
where Y' is a degenerated fuzzy number, Y'' is a retrieved fuzzy number, and ρ is the retrieval
function:

$\rho : \mu_A(y') \rightarrow \mu_A(y'') \ :\ \forall\, y' \in Y' \text{ and } y'' \in Y''$ (EQ4)

where $\rightarrow$ represents a mapping relation.

Hereafter we use a simplified notation: $\mu_A(y_i) = \mu_{y_i}$, representing the membership function value of
Y at the bit position i.

Definition 4. The fuzzy data retrieval function is defined by

$\rho(\mu_{y'}) = \mu_{y''} \cong \mu_{y}$ (EQ5)

where Y is supposed to be an original fuzzy number which is free of missing bits (see Fig. 1).


Figure 2. Interpretation of the definition of fuzzy retrieval system. Y represents the original
number.

The characteristic of the retrieval function depends on the particular application. The implementation 
of a linear approximation technique, which seems to be good enough for most practical applications 
of fuzzy logic to the fuzzy data retrieval, is described below. In the process of approximation of the 
missing values for the membership function, two basic cases should be distinguished. 

Case 1.

The missing membership function values correspond to bits which are not the first {y_1}, the last {y_n},
nor the mean {y_0}. The number of bits with missing values in the left or right intervals can be arbitrary. In
this case, the retrieval function simply extrapolates the missing membership function values based on
the existing nearest values:

$\rho(\mu_{y'}) = \ldots$ (EQ6)

Assuming that there are k missing bits, which start from the s-th bit, the membership function values can
be obtained by incorporating $y_s, y_{s+1}, \ldots, y_{s+k}$ into EQ6. Note that, in such a case, it is necessary to
approximate the membership function to the nearest available level of quantization. Let us consider
the simple example where the number of bits with the missing membership function is equal to one
(m = 1). The bit number with the missing membership function in the degenerated fuzzy number is
denoted by k. In such a case the membership function for the bit k can be approximated by¹:

$\mu_{y_k} = \mathrm{int}\left(\frac{\mu_{y_{k+1}} + \mu_{y_{k-1}}}{2}\right)$ (EQ7)


1. In this context the int function means the evaluation to the nearest quantization level.
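
The interior case can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the author's implementation: missing bits are marked with `None`, `levels` is a hypothetical number of quantization steps on [0, 1], and each gap is filled by straight-line interpolation between its nearest known neighbours, then rounded to the nearest level. For a single missing bit the result reduces to the average of the two neighbours, as in EQ7.

```python
def retrieve_interior(mu, levels=10):
    """Fill interior gaps (None entries) by linear interpolation.

    mu     -- list of membership values in [0, 1]; None marks a missing bit
    levels -- assumed number of quantization steps between 0 and 1
    Gaps that touch the first or last bit belong to Case 2 and are rejected.
    """
    out = list(mu)
    n = len(out)
    i = 0
    while i < n:
        if out[i] is not None:
            i += 1
            continue
        start = i
        while i < n and out[i] is None:
            i += 1
        end = i  # index of the first known bit after the gap, or n
        if start == 0 or end == n:
            raise ValueError("gap touches a boundary bit (Case 2)")
        left, right = out[start - 1], out[end]
        span = end - (start - 1)
        for k in range(start, end):
            value = left + (right - left) * (k - (start - 1)) / span
            # round() resolves exact ties to the even level; the paper does
            # not say how ties are handled, so this is an assumption.
            out[k] = round(value * levels) / levels
    return out


if __name__ == "__main__":
    print(retrieve_interior([0.2, None, 0.6, 1.0, 0.6, None, None, 0.0]))
```

The sketch does not know which bit is the mean, so a caller would have to route a missing mean bit to the Case 2 procedure separately.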




If for any i * k the \i y * 1 .then the a, is set to 1 (see Fig. 1 .a). 

Case 2.

The missing membership function values correspond to the bits which are the first {y_1}, the last {y_n}, or
the mean {y_0}, or contain these bits. In this case, the retrieval function simply extrapolates the missing
membership values.



Figure 3. Example of membership function retrieval by linear approximation for a single bit 
(a), and for three bits missing (b). Black square represents known value and empty square 

represents retrieved value. 

Let us consider the example where only single bits {y_1}, {y_n}, or {y_0} are missing. In such a case
the missing membership function values can be calculated using the formula given in EQ7. The only
difference is that instead of calculating the membership function for the center bit, the one for the
boundary bit is calculated.

The case where several bits have missing membership function values is not trivial and needs more
discussion. It is proposed that the missing membership function values for the boundary bits (for left
and right intervals) can be evaluated using the linear (prediction) function calculated based on the
membership function values for the last two boundary bits¹. Assume that there are two linear
functions calculated for left and right intervals with the intersection point below 1 (see Fig. 4a). In such a
case the mean value of a membership function is approximated to the nearest neighbor for any
missing values of $y_i$, and finally the mean value $\mu(y_0)$ is set to 1: $\mu(y_0) = 1$.



Figure 4. Example of membership function retrieval by linear approximation for several bits 
including the mean value: intersection point below 1 (a), intersection point above 1 (b). 

Then new approximation functions are obtained based on the parameters of point (1, $y_0$), and the last
left [$\mu(y_{ll})$, $y_{ll}$] and first right [$\mu(y_{fr})$, $y_{fr}$] bits (see Fig. 4a). Assume that there are two linear functions
calculated for left and right intervals with the intersection point above or equal to 1 (see Fig. 4b). In
such a case the x coordinate for the mean value is approximated to the nearest $y_i$ neighbor
and $\mu(y_0)$ is set to 1: $\mu(y_0) = 1$. Then new approximation functions are obtained based on the
parameters of point (1, $y_0$), last left [$\mu(y_{ll})$, $y_{ll}$], and first right [$\mu(y_{fr})$, $y_{fr}$] bits (see Fig. 3b). [...]


1. The procedure for determining missing values for membership function is discussed later in this section.

Case 2a.

Subcase 2a relates to the situation when the unimodal fuzzy number might have a trapezoidal
membership function. Such a case may originate when the approximation (based
upon the existing values) “clamps” more than one quantization interval (see Fig. 5). In this case the
membership function values can be simply approximated by setting their values to one.


Figure 5. Example of membership function retrieval for the fuzzy number with trapezoidal
membership function.


3 Linear approximation procedure and neural network

As proved in [5], any continuous function can be uniformly approximated by a continuous
NN using one hidden layer and with an arbitrary continuous nondecreasing function. Such a characteristic
can be utilized for the task of approximating missing membership function values by the linear
combination of existing values. Therefore, the linear approximation procedure should be [...]
obtain the training data set for the ANN. Note that only selected membership function values [...]
evaluation of missing values. The following architecture was assumed for the ANN implementation:
[...] consists of input neurons [...].

Figure 6. Structure of the ANN (input layer, hidden layer, output layer).






Each layer is completely connected, meaning that each neuron from a given layer is connected to
all neurons of the next layer. A unique weight is associated with each connection. It was assumed that
for fuzzy data retrieval the number of neurons in each layer equals the number of bits in the fuzzy
number. Figure 6 illustrates the details of the discussed ANN configuration.

Now, the problem can be reformulated into whether it is possible to obtain sets of weights which
minimize the total error of the linear combination of all existing membership function values with
respect to the linear combination procedure. In the proposed method the output of the ANN, which
represents a retrieved membership function value $\mu''^{\,p}_{i}$ for the i-th bit, can be described by:

$\mu''^{\,p}_{i} = g\Bigl(\sum_{j=1}^{N} v_{ij}\, g\Bigl(\sum_{k=1}^{N} w_{jk}\, \mu'^{\,p}_{k}\Bigr)\Bigr)$ (EQ8)

where g(.) denotes the activation function, p denotes the p-th input pattern, $\mu'^{\,p}_{k}$ denotes the membership
function value, $v_{ij}$ represents the weight between the i-th output layer element and the j-th hidden layer element,
while $w_{jk}$ represents the weight between the j-th hidden layer element and the k-th input layer element (Fig. 6).
The proposed linear approximation procedure used for the i-th bit membership function value retrieval
can be formally described as:

$\hat{\mu}^{\,p}_{i} = \sum_{j,\, j \neq i} a_{j}\, \mu'^{\,p}_{j}$ (EQ9)

where a is an arbitrary coefficient (for i ≠ j). Then the typical error function can be given by:

$\varepsilon = \sum_{p} \sum_{i} \Bigl[\, \sum_{j,\, j \neq i} a_{j}\, \mu'^{\,p}_{j} - g\Bigl(\sum_{j=1}^{N} v_{ij}\, g\Bigl(\sum_{k=1}^{N} w_{jk}\, \mu'^{\,p}_{k}\Bigr)\Bigr) \Bigr]^{2}$ (EQ10)


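This is an absolutely continuous, differentiable function of the weights, so the matrices W and V that
minimize the error can be found using the backpropagation method. For hidden-to-output and for
input-to-hidden connections the steepest descent rule (which is the basis of the backpropagation method)
gives the expressions below. Here $\varepsilon$ is taken as the sum of squared differences
$\sum_p \sum_i [d^p_i - o^p_i]^2$, where $o^p_i$ is the network output of EQ8 and $d^p_i$ is the linear
approximation of EQ9, and $h^p_j = \sum_k w_{jk}\, \mu'^{\,p}_k$ and $s^p_i = \sum_j v_{ij}\, g(h^p_j)$ are the net inputs
of the hidden and output elements:

$\frac{\partial \varepsilon}{\partial v_{ij}} = -2 \sum_{p} \bigl[d^p_i - o^p_i\bigr]\, g'(s^p_i)\, g(h^p_j)$ (EQ 11)

$\frac{\partial \varepsilon}{\partial w_{jk}} = \sum_{p} \sum_{i} \frac{\partial \varepsilon}{\partial o^p_i}\, \frac{\partial o^p_i}{\partial w_{jk}} = -2 \sum_{p} \sum_{i} \bigl[d^p_i - o^p_i\bigr]\, g'(s^p_i)\, v_{ij}\, g'(h^p_j)\, \mu'^{\,p}_k$ (EQ 12)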
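
The forward pass of EQ8 and the steepest-descent derivatives of EQ11 and EQ12 can be illustrated with a small pure-Python sketch. Several things here are assumptions rather than details from the paper: the logistic activation for g, the error taken as a plain sum of squared differences for one pattern, the function names, and the three-bit example in the usage section. The paper's simulations used the PlaNet simulator, which this sketch does not reproduce.

```python
import math


def g(s):
    """Logistic activation (an assumed choice for g)."""
    return 1.0 / (1.0 + math.exp(-s))


def g_prime(s):
    y = g(s)
    return y * (1.0 - y)


def forward(mu, w, v):
    """Two-layer forward pass: out_i = g(sum_j v[i][j] * g(sum_k w[j][k] * mu[k]))."""
    hidden_in = [sum(wj[k] * mu[k] for k in range(len(mu))) for wj in w]
    hidden = [g(s) for s in hidden_in]
    out_in = [sum(vi[j] * hidden[j] for j in range(len(hidden))) for vi in v]
    out = [g(s) for s in out_in]
    return hidden_in, hidden, out_in, out


def error(mu, target, w, v):
    """Sum of squared differences between targets and outputs for one pattern."""
    out = forward(mu, w, v)[3]
    return sum((t - o) ** 2 for t, o in zip(target, out))


def gradients(mu, target, w, v):
    """Derivatives of the error with respect to w and v (chain rule)."""
    hidden_in, hidden, out_in, out = forward(mu, w, v)
    delta = [-2.0 * (t - o) * g_prime(s) for t, o, s in zip(target, out, out_in)]
    grad_v = [[delta[i] * hidden[j] for j in range(len(hidden))]
              for i in range(len(v))]
    grad_w = [[sum(delta[i] * v[i][j] for i in range(len(v)))
               * g_prime(hidden_in[j]) * mu[k]
               for k in range(len(mu))]
              for j in range(len(w))]
    return grad_w, grad_v


def descent_step(mu, target, w, v, rate=0.05):
    """One steepest-descent update of both weight matrices."""
    grad_w, grad_v = gradients(mu, target, w, v)
    new_w = [[w[j][k] - rate * grad_w[j][k] for k in range(len(w[j]))]
             for j in range(len(w))]
    new_v = [[v[i][j] - rate * grad_v[i][j] for j in range(len(v[i]))]
             for i in range(len(v))]
    return new_w, new_v


if __name__ == "__main__":
    mu, target = [0.2, 0.6, 1.0], [0.2, 0.5, 1.0]
    w = [[0.1 * (j - k) + 0.05 for k in range(3)] for j in range(3)]
    v = [[0.3 if i == j else -0.1 for j in range(3)] for i in range(3)]
    print(error(mu, target, w, v))
    w, v = descent_step(mu, target, w, v)
    print(error(mu, target, w, v))
```

A finite-difference comparison against `gradients` is a cheap way to confirm that the analytic derivatives and the error function agree before any training is attempted.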


Therefore, the system proposed before [2] was extended by the preprocessing stage incorporating
an ANN for the retrieval of degenerated fuzzy numbers, followed by the Fuzzy Data Bus and the system for
the realization of fuzzy operations. The result also shows that it is not necessary to implement a special
type of ANN, like the one suggested in [7], in order to obtain a good approximation of a fuzzy
operation supplemented with the retrieval stage.


4 Computer simulation results 

The data for training the retrieval ANN were prepared using the linear approximation
procedure in such a way that the degenerated fuzzy numbers were set to the input pattern and the retrieved
fuzzy numbers were set to the output pattern. Up to 70% of bits with missing membership functions
were included in the set of 1024 degenerated fuzzy numbers (32 bits each). The PlaNet [4] simulator
was used to train the ANN (with the backpropagation procedure implemented for updating the weights)
until the average error (for all patterns) was less than 0.0001.


The data processing system consists of two stages: the first stage performing fuzzy data
retrieval (described in Section 3), and the second stage performing six basic fuzzy operations
(described in [2]). The original architecture of the system described in [2] is illustrated in [...]
retrieval stage [...].

The main goal of the application of this stage is to obtain retrieved fuzzy numbers available for
[...] further processing. Note [...] addition) as well as complex fuzzy operations such as inference,
so such a system perfectly matches the general fuzzy modeling requirements. In the presented system
the second-stage networks perform addition, subtraction, multiplication, division, maximum and
minimum (to be [...] previously designed system). Table 1 summarizes results (average error)
obtained [...].




TABLE 1. [...] ANNs perform: addition, subtraction, multiplication, division, maximum, and minimum.

Pattern      Addition  Subtraction  Multiplication  Division  Maximum   Minimum

Training     0.000353  0.000335     0.000395        0.000219  0.000382  0.000331

Testing (d)  0.004032  0.005630     0.009541        0.007680  0.006516  0.005766


Figure 7. Architecture of the original fuzzy data processing system based on the ANN [2] (* denotes
the different data bus width as a result of fuzzy operations).

[...] the errors obtained from testing patterns including degenerated fuzzy numbers are
[...] greater than the original training errors. If we include in the simulation the ANN retrieval
stage [...] than [...], the thus obtained [...].


Figure 8. General architecture of fuzzy data processing system using ANNs (inputs: Input 1, Input 2;
second-stage panels: Addition Network, Subtraction Network, ..., Minimum Network).

TABLE 2. Comparison of average errors for testing (degenerated) pattern obtained from the proposed fuzzy 
data processing system including fuzzy data retrieval stage. Second-stage ANNs perform: addition, 
subtraction, multiplication, division, maximum, and minimum. 

Pattern Addition Subtraction Multiplication Division Maximum Minimum 

Testing 0.000643 0.000761 0.000577 0.000867 0.000742 0.000522 
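
The effect of the retrieval stage can be read off by dividing each testing error in Table 1 (degenerated patterns, original system) by the corresponding entry in Table 2. The short sketch below does only this arithmetic; the variable and function names are illustrative, and the numbers are copied from the two tables.

```python
OPERATIONS = ["addition", "subtraction", "multiplication",
              "division", "maximum", "minimum"]

# Average testing errors on degenerated patterns, copied from the tables.
WITHOUT_RETRIEVAL = [0.004032, 0.005630, 0.009541, 0.007680, 0.006516, 0.005766]
WITH_RETRIEVAL = [0.000643, 0.000761, 0.000577, 0.000867, 0.000742, 0.000522]


def improvement_factors():
    """Ratio of the Table 1 testing error to the Table 2 testing error."""
    return {op: before / after
            for op, before, after in zip(OPERATIONS, WITHOUT_RETRIEVAL, WITH_RETRIEVAL)}


if __name__ == "__main__":
    for op, factor in improvement_factors().items():
        print(f"{op:15s} {factor:5.1f}x")
```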

Figure 9 illustrates a fragment of the fuzzy data processing system, extracted from the original
design (see Fig. 8), including the retrieval preprocessor. The values of the membership function are coded
in the form of a sequence of squares. The area of a single square for a specific bit relates to the
membership function value in such a way that the largest square represents 1 and the smallest 0.1 (an empty place
indicates 0). Two 32-bit fuzzy numbers are set to Input1 and Input2 (the data on Input1 are degenerated:
the membership function values are missing for two bits). Then they are processed in the retrieval ANN
(Hidden1 (32b) and Hidden2 (32b)). Finally, the retrieved numbers are displayed in the Hidden3 layer
(compare Input1 and Input2 with Hidden3 (64b)). Then, these two ordinary numbers (first retrieved and
second original) are processed in the subsequent network (Hidden4 (64b), Hidden5 (64b)), producing
the result of the operation (in this case, addition) at the Output (64b) [4].


Figure 9. Example of the [...] (at the top, the trace of average error for training this part of
[...] preprocessor and the [...] is shown).


5 Conclusions

In this paper the implementation of a fuzzy data processing system using artificial neural networks
is described. As it was verified in [2], the average errors for testing patterns containing
degenerated data were about two to five times greater than the error for the normal testing data. In
order to support fuzzy [...] multiplication, fuzzy division, maximum
and minimum for the degenerated data, the ANN devoted to retrieval was designed, trained, and
implemented into the fuzzy data processing system. The retrieval
process was based on the linear approximation (prediction) of the existing data for the incomplete
fuzzy numbers. Such an approach improved the accuracy (up to [...] times) of the
results of operations performed on [...] fuzzy numbers; however, with the increase of
missing membership function values [...] increases. The results of testing of the
proposed system show that when up to [...] of the membership function values are missing, the average
error slightly increases (up to five times); from 30% to 50%, the error is one order of
magnitude higher. Finally, from 50% to 75% the error can be ten to one hundred times greater than
[...] architecture for fuzzy data processing [...] damaged fuzzy data by [...] real-time control [...].

One should also state the main advantage [...] with changing of the [...]
dynamic, time-variant system.




6 References 

[1] A. Kaufmann and M. Gupta, Introduction to Fuzzy Arithmetic. Theory and Applications, Van 
Nostrand Reinhold Company, New York, 1985. 

[2] M. J. Patyra, “Implementation of Fuzzy Operations with Neural Networks”, IEEE Conference 
"Fuzzy and Neural Systems and Vehicle Applications’ 91” , Tokyo, Japan, November 1991. 

[3] H.Takagi and I. Hayashi, “NN Driven Fuzzy Reasoning”, International Journal of Approximate 
Reasoning, pp. 191-212 (8), 1991. 

[4] Y. Miyata, “PlaNet User’s Guide”, Computer Science Department, University of Colorado, Boul- 
der, CO, 1991. 

[5] G. Cybenko, “Approximation by Superposition of a Sigmoid Function”, Mathematics of Control,
Signals and Systems, pp. 303-314, (2), 1989.

[6] S. Miyamoto, “Fuzzy Sets in Information Retrieval and Cluster Analysis”, Kluwer Academic
Press, Dordrecht, Netherlands, 1990.

[7] S. Horikawa et al., “A Study on Fuzzy Modeling Using Fuzzy Neural Networks”, Proc. of
International Fuzzy Engineering Conference, IFES ‘91, pp. 562-573, Yokohama, Japan, November
1991.


N93-29579

Stochastic architecture for Hopfield neural nets


Sandy PAVEL 

Polytechnical Institute of Iasi. 
Bd. Copou, nr. 11. Iasi, ROMANIA 


Abstract 

An expandable stochastic digital architecture for recurrent
(Hopfield like) neural networks is proposed. The main features and
basic principles of stochastic processing are presented. The
[...] on a chip with n fully [...]
is provided.

Introduction 

The analog implementation of Hopfield neural network is of
actual interest [5]. Due to the great complexity of the
interconnections and to the presence of parasitic coupling paths,
analog networks are prone to follow incorrect trajectories
[...]. This reduces by an order of magnitude the number
[...] that can be built on a chip. At the same time, large
applications require interconnecting many such chips. Due to the
[...] which distort the analog signals, this becomes
another difficult task. A digital stochastic architecture avoids
these problems. Here signals are more easily passed between
chips and are [...] noise. By using a time-multiplexed
[...] is reduced and so leads to
flexible multi-chip systems. Recurrent networks operate by
[...] changes into the neural state. This integrative
process has a low-pass filtering effect, reducing also the inherent
stochastic [...]. [...] overview of the methods for [...] digital
[...] chip, and [...] the system expandability, reliability and
reconfigurability are treated along with discussions on execution
speed.

Digital stochastic encoding of information 

A stochastic encoder is basically a tunable random pulse
generator. The probability of occurrence for a pulse, i.e. the mean
[...], is [...] the input to be encoded in such a manner
that

$$p(x) = S_u / S_{\max} \qquad (1)$$


In equation (1), p(x) is the probability that the binary random
pulse train assumes a value of 1 at a given moment, S_u is the value to
be encoded, and S_max represents the maximum possible value for the
signal S. Thus the probability of a pulse in the pulse train is
proportional to the normalized input signal. The basic circuit for
encoding a digital signal (number) into a random pulse train with
appropriate probability is shown in figure 1.

The number N is compared with a random number R, uniformly
distributed over [Rmin, Rmax]. The output of the comparator will
pulse if R < N, creating a stochastic firing signal X, whose mean is
proportional to N provided N is in [Rmin, Rmax]. The stochastic
encoding represents an analog signal mapping. By using non-weighted
bits in a code of infinite word length, it is extremely noise-proof.
At the same time it has an adaptive accuracy. As information
is recovered through a pulse counting process, one can at any
moment decide for a fast but imprecise or for a slow but accurate
response. The computations are easily performed on such signals
using space- and speed-efficient digital logic [4].
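
The comparator-based encoder described above can be illustrated with a short software simulation. This is only a sketch under stated assumptions: the function names, the range [0, 256), the seed and the train lengths are hypothetical, and a software pseudo-random generator stands in for the hardware RNG.

```python
import random

def stochastic_encode(n, r_min, r_max, length, rng):
    """Return a list of 0/1 pulses; a pulse fires whenever R < n."""
    return [1 if rng.uniform(r_min, r_max) < n else 0 for _ in range(length)]

def decode(pulses):
    """Recover the encoded value as the mean pulse rate (pulse counting)."""
    return sum(pulses) / len(pulses)

rng = random.Random(0)                      # fixed seed, illustration only
pulses = stochastic_encode(n=192, r_min=0, r_max=256, length=20000, rng=rng)
estimate = decode(pulses)                   # slow but accurate: near 192/256
short_estimate = decode(pulses[:100])       # fast but imprecise
```

Counting over the full train gives a value close to 192/256 = 0.75, while the 100-pulse window gives a coarser answer sooner, which is the speed/accuracy trade-off mentioned in the text.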

Arithmetic computing elements 

The basic arithmetic computing elements used in this approach
are: multiplication, counting/accumulation and linear/nonlinear
transfer functions.

For example, if two statistically independent binary random
pulses, x and y, with probabilities p(x) and p(y) are ANDed, the
result has the probability:

$$p(r) = p(x\ \mathrm{AND}\ y) = p(x)\,p(y) \qquad (2)$$

That is, a multiplier in a stochastic architecture is a simple AND
gate.
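
Equation (2) can be checked numerically with two independent pulse trains. The sketch below is illustrative only; the probabilities 0.6 and 0.5, the train length and the helper name `pulse_train` are assumptions.

```python
import random

def pulse_train(p, length, rng):
    """Independent binary pulses, each equal to 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

rng = random.Random(1)
length = 50000
x = pulse_train(0.6, length, rng)
y = pulse_train(0.5, length, rng)

# Bitwise AND of the two trains plays the role of the AND gate.
product = [a & b for a, b in zip(x, y)]
p_product = sum(product) / length           # estimate of p(x) * p(y)
```

The estimate approaches 0.6 * 0.5 = 0.30 only because the two trains are statistically independent; ANDing a train with itself would return p(x), not p(x) squared.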

The easiest way to perform the accumulation function, which is
equivalent to an integration operation, is to use a counter. For
a neural processor, this will count the number of pulses which
result from the multiplying operation between weights and neural
outputs.

In simulating neural nets, the most time consuming operation
is to apply the transfer function, which usually is a nonlinear
one, to the neural state and obtain the neural output response. The
use of stochastic encoding provides a significant time gain in
performing linear/nonlinear transfer functions. Returning to the
digital-to-stochastic converter, one can see that the mean value of
the output binary pulse train is:

$$\langle x \rangle = \Pr[R < N] \qquad (3)$$

which is the cumulative distribution function (CDF) of R. If N is
the value of the neural state u(t), and x is the neural output
pulse train, then the transfer function can be modified by adjusting
the probability density function (PDF) of the random number
generator (see figure 1.a, b, c). While a uniform PDF gives a linear
transfer function with hard limits, a PDF like the one in figure 1.c gives
the more commonly used sigmoidal transfer function.
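
As a hedged illustration of equation (3), the sketch below draws R from a logistic distribution, whose CDF is exactly the logistic sigmoid. The logistic choice, the scale parameter and the function names are assumptions made for the example; the paper only states that a bell-shaped PDF as in figure 1.c yields a sigmoidal transfer function.

```python
import math
import random

def logistic_sample(scale, rng):
    """Draw R from a logistic distribution by inverting its CDF."""
    u = rng.random()
    while u == 0.0:                      # avoid log(0)
        u = rng.random()
    return scale * math.log(u / (1.0 - u))

def neuron_output_rate(state, scale, length, rng):
    """Mean of the pulse train produced by comparing R with the neural state."""
    fired = sum(1 for _ in range(length) if logistic_sample(scale, rng) < state)
    return fired / length

rng = random.Random(3)
rate_at_zero = neuron_output_rate(0.0, 1.0, 20000, rng)   # near 0.5
rate_at_two = neuron_output_rate(2.0, 1.0, 20000, rng)    # near 1/(1+exp(-2))
```

No explicit evaluation of the sigmoid is needed at run time: the nonlinearity is carried entirely by the distribution of the random numbers fed to the comparator.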




The stochastic architecture 


The equation governing the integration of charge in the
Hopfield network is:

$$\frac{du_j}{dt} = \sum_i T_{ij} V_i + I_j \qquad (4)$$

If the time slice dt is much smaller than the main integration
period, so that the capacitor voltages do not change too much,
equation (4) can be approximated by:

$$u_j(n\,dt + dt) = u_j(n\,dt) + T_{ij} V_i(n\,dt)\,dt \qquad (5)$$

Therefore, in dt time, for each neuron j, the summation of only one
product T_ij V_i to the state u_j is performed.

The architecture based on (5) has N neurons operating in
parallel, at the clock frequency f_c, and an execution speed of N*f_c
connections per second.

As long as the time-multiplexing implied by (5) has not
prevented proper convergence or caused faulty operation [1],
equation (6) also holds true:

$$u_j(n\,dt + dt) = u_j(n\,dt) + \sum_{h=1}^{n} \sum_{r=1}^{b} T_{i(h,r)\,j}\, V_{i(h,r)}(n\,dt)\,dt \qquad (6)$$

where n*b < N. Thus, in dt time the summation of (n*b) products
(T_ij V_i) to the state u_j is performed. In this way the state update
speed can be enhanced n*b times.

The digital stochastic architecture proposed in this paper is
based on equation (6). The basic building block is a chip with n
fully interconnected neurons, operating in a pipe-line, bit serial
manner on words of b bits length. The chip is depicted in figure 2.
There are two types of processors: Synaptic Processors (SP) and
Neural Processors (NP). Each synaptic processor, SP_ij, performs one
product and two summations in parallel in a clock period. It
contains a comparator Comp and a counter Count which are organized
in a bit serial, pipe-line structure (figure 3). The weight shift
register (WSR) has a set of K registers of b-bit word length.
Skewing the outputs of the WSR and of the random number generator
(RNG), every clock period T_c, a pulse with probability proportional
to |T_ij| results at the output of the comparator. The weight
multiplication is performed by ANDing this signal with the neural
output value received on the V_i line. The result is added to or
subtracted from Count if T_ij is positive or negative, respectively.
This represents the summation over the r index in equation (6). After b
clock periods the content of Count is transferred to the shift
register (SR), and Count is reset. The value in SR is added
serially to the value u'_j(i-1), resulting in u'_j(i). The value u'_j(i-1)
is a partial sum of the neural state u_j(t) computed in the
previous cycle in the (i-1)th stage.
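
The behaviour of one synaptic processor can be sketched in software as a comparator, an AND gate and an up/down counter. This is a simplified illustration, not the bit-serial pipeline of figure 3: weights are taken as signed values in [-1, 1], neural activities as firing probabilities, and the function name and numbers are hypothetical.

```python
import random

def synaptic_count(weights, activities, clocks, rng):
    """Up/down counter value after `clocks` periods spent on each weight in turn."""
    count = 0
    for t_ij, v_i in zip(weights, activities):
        for _ in range(clocks):
            weight_pulse = rng.random() < abs(t_ij)   # comparator: RNG against |T_ij|
            activity_pulse = rng.random() < v_i       # stochastic neural output V_i
            if weight_pulse and activity_pulse:       # AND gate acts as multiplier
                count += 1 if t_ij > 0 else -1        # add or subtract by sign of T_ij
    return count

rng = random.Random(2)
weights = [0.8, -0.5, 0.3]
activities = [0.5, 0.4, 1.0]
clocks = 20000
estimate = synaptic_count(weights, activities, clocks, rng) / clocks
exact = sum(t * v for t, v in zip(weights, activities))   # weighted sum being estimated
```

Dividing the counter by the number of clock periods recovers an estimate of the weighted sum of the inputs, which is the quantity accumulated into the neural state.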





The neural processor NP_j has a bit serial adder, a digital-to-stochastic
converter and a shift register which stores the neural
state value (figure 4). Every b clock periods, a new partial sum
u'_j(n) is added to the neural state u_j(t). This process represents
the summation over the h index in equation (6).


The lines V_i, i = 1, ..., n, are internally connected to SP_ij, j = 1, ..., n,
and are also used to interconnect chips forming large systems.
External logic is needed for recovering the mean value of the
signals which are the neural activity information.

Expandability , fault tolerance and speed 

In large applications, which require a great number of
neurons, say N, K chips must be used (K = [N/n] + 1). By providing
the SP with K_max weight registers, the neural network will be
extensible to N_max neural processors. K_max is also the maximum number
of chips that can be connected in a system. Such a system is
organized around an n-bit bus. In a given period, only one chip puts
on the bus the output values of its neural processors. All chips
read these values, performing the state update function. This
is done once for each chip, after which the cycle is repeated.

This type of architecture, implying the connection to a single
bus, allows the number of neural processors to be changed easily by
inserting or removing chips. At the same time, if a chip contains
too many defective elements it may be bypassed by deselection. At
that moment, an idle, unused chip will replace the defective one.

The weight update speed can be evaluated taking into account
that the total number of neurons in a system is N = K*n, and the
number of weights is W = (K*n)^2. In each chip, for each neuron,
(n*b) connections are computed in b clock periods (T_c), the speed
being:

$$S = (K n^2) / T_c = n\,N\,f_c \ \text{connections/second} \qquad (7)$$

For example, a system with n = 100, K = 10 and f_c = 100 MHz
has the execution speed S = 10^13 connections/second. This value is
well above any reported implementations.
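
The arithmetic behind equation (7) and the numerical example can be spelled out in a few lines. The variable names are ours; the values are those quoted in the text.

```python
import math

n = 100            # neurons per chip
K = 10             # number of chips
f_c = 100e6        # clock frequency in Hz
N = K * n          # total number of neurons
T_c = 1.0 / f_c    # clock period in seconds

speed_from_chips = K * n**2 / T_c     # left-hand form of equation (7)
speed_from_neurons = n * N * f_c      # right-hand form of equation (7)
state_update_speed = n * f_c          # products added to one neural state per second
```

Both forms of (7) give 10^13 connections per second, and each neural state receives n*f_c = 10^10 products per second.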

A performance parameter of a time-multiplexed architecture is the
neural state update speed (NUS). NUS is the number of products
(T_ij V_i) added to the neural state in unit time. For the previously
reported implementation [1], the NUS was equal to f_c values per
second. In our architecture NUS reaches (n*f_c) values per second.

Conclusions 

A stochastic digital architecture for networks of the
recurrent type has been described in this paper. The low-pass
filtering, integrative nature of these networks was well-suited for
an implementation based on stochastic techniques built from
entirely digital circuitry. The combination of all-digital signals
and pipe-line, bit serial processors led to a system which could be
spread across multiple chips. The reduced interconnectivity of the
VLSI system made dynamic reconfigurability and fault tolerance easy
to achieve.





References 

1. [...] "A Digital Architecture Employing [...]", [...] Academic Publishers.

3. HOPFIELD, [...] Proceedings of the National Academy of Sciences USA 81, May 1984,
pp 3088-92.

4. ROBERT HASSER - "Why do neural network [...]", [...] pp 23 [...],
Elsevier Science Publishers B.V., 1990.

5. STUART MACKIE, et al. [...]


Figure 1. Digital to Stochastic Converter (panels a, b, c: p.d.f. of the random number generator)

Figure 3. Synaptic Processor

Figure 4. Neural Processor




N93-29580 

HIERARCHICAL MODEL OF MATCHING 


Witold Pedrycz 

Dept. of Electrical & Computer Engineering
University of Manitoba
Winnipeg, Manitoba, Canada R3T 2N2
pedrycz@eeserv.ee.umanitoba.ca


Eugene Roventa
Dept. of Computer Science
Glendon College, York University
Toronto, Ontario, Canada M4N 3M6
Roventa@venus.YorkU.ca


Abstract The issue of matching two fuzzy sets becomes an essential design aspect of many 
algorithms including fuzzy controllers, pattern classifiers, knowledge-based systems, 
etc. This paper introduces a new model of matching. Its principal features involve: (i) 
matching carried out with respect to the grades of membership of fuzzy sets as well as 
some functionals defined on them (like energy, entropy, transom), (ii) concepts of 
hierarchies in the matching model leading to a straightforward distinction between 
“local” and “global” levels of matching; (iii) a distributed character of the model realized 
as a logic-based neural network. 


Keywords matching, hierarchical model, local and global level of matching, logic-based neural 
networks 


1. Introduction

Defining and handling problems of matching fuzzy sets (linguistic quantities) has become a
domain of intensive research dating from the very emergence of fuzzy sets. The abundance of the
matching methods available nowadays is evident. Different approaches stemming from measuring
distances between membership functions, calculating possibility and necessity measures, using
fuzzy measures and integrals, to name a few among them, give a good impression of their
variety, cf. [4], [5], [6].

The proposed model embraces three new features that are nonexistent or not fully addressed in
the framework of the previous methodology. They are, however, important in dealing with fuzzy
sets. One should stress that fuzzy sets form a collection of objects belonging to a given category to
a certain degree. As such the grade of membership at x_i ∈ X does not exist in isolation and is
usually related to (affected by) other membership values that the fuzzy set takes on in the neighbourhood
of this point. This fact implies that this phenomenon should have a direct impact on the development
of matching procedures.

Firstly, the two levels of hierarchy at which the matching process is carried out are distinguished:
(i) a local level of matching dealing with the grades of membership of two fuzzy sets pertaining to
the same element of the universe of discourse, and (ii) a global level of matching where all those
“local” characteristics are summarized (aggregated).

Secondly, the local level of matching should also handle several criteria of matching not being
exclusively restricted to the analysis of the grades of membership of the objects. Some other
functionals defined over the membership values (like entropy, energy or transom) might be worth
considering in this context.

The discussed model of matching is fully distributed and utilizes logic neurons [8], [9] to 
constructively accomplish matching at the indicated levels. 

In the remainder of the paper we will consider fuzzy sets defined in a finite universe of
discourse, say X = {x_1, ..., x_n}. The discussion regarding the local level of matching will be
covered in Section 2. In Section 3 we will proceed with the global level of matching showing how
different elements of X interact within the process of matching. The learning algorithm leading to
parametric adjustments of the connections in the model will be studied in Section 4.

2. The pointwise level of matching of fuzzy sets 

Let us discuss two grades of membership at a certain element of X, say a = A(x_i), b = B(x_i)
where A and B are fuzzy sets. The general question arises: why are these fuzzy sets similar or what
makes them different? First of all it is likely that a very preliminary answer to this problem can be
formulated by studying the corresponding values of the membership functions of A and B. They
are usually deemed essential in expressing similarity between fuzzy sets.

One among existing alternatives useful in describing similarity of fuzzy sets could be the use of the
following equality index, cf. [7]:

$$a \equiv b = \tfrac{1}{2}\left[(a \varphi b) \wedge (b \varphi a) + (\bar{a} \varphi \bar{b}) \wedge (\bar{b} \varphi \bar{a})\right] \qquad (1)$$

where \bar{a} = 1 - a and the φ-operation (pseudocomplement) is defined as follows:

$$a \varphi b = \sup\{c \in [0,1] \mid a\,t\,c \le b\}$$

and “t” denotes a triangular norm while “∧” stands for minimum. The equality index attains 1 if
and only if the arguments are equal, a = b. It should be stressed that the values produced by the
equality index are not context sensitive, viz. this index produces the same result whenever the mutual
position of the arguments is the same. This means, for instance, that 0.1 ≡ 0.1 = 1 and 0.9 ≡ 0.9 =
1. This could cause an undesired lack of discriminatory properties of this definition. On the contrary,
it could be propitious to discriminate between situations where matching achieved for the higher
membership values such as 0.9 and 0.8 is more significant than the one reported for the lower
values, say 0.05 and 0.2. One feasible solution to this deficiency would be to perform (1) not
only on the membership grades but also on their functionals. We will study three well-known
families of these membership functionals:

- energy-type functionals [1], [2] compute values of a certain monotonically nondecreasing
function defined over the original membership values, say ψ_1: [0,1] → [0,1]. For instance,
one can refer to polynomial-type energy functionals of the form ψ_1(u) = u^p, p > 1;

- entropy-type functionals, cf. [1], [2], are defined as mappings ψ_2: [0,1] → [0,1] such that

(i) ψ_2 is monotonically increasing over [0, 1/2] and

(ii) ψ_2 is monotonically decreasing over [1/2, 1].

Moreover one assumes that the functionals attain maximum at 1/2, ψ_2(1/2) = 1;

- transom-like functionals [10], [11] resemble the functionals of the first class. The modification
is such that low and high grades of membership (i.e., the values lying around 0 and 1) are
ignored. One can characterize these functionals as ψ_3: [0,1] → [0,1], such that

ψ_3(u) = 0, if u ≤ α,

ψ_3(u) = ψ_1(u), if u ∈ (α, β), and

ψ_3(u) = 0, if u ≥ β

where α and β are threshold levels.

The examples of these functionals expressed with the aid of linear or piecewise linear
relationships are included in Fig. 1.
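
A minimal sketch of the equality index and the three families of functionals follows. It assumes the Lukasiewicz t-norm, for which the pseudocomplement is min(1, 1 - a + b) and the index reduces to 1 - |a - b|; the exponent p = 2, the thresholds 0.2 and 0.8, and all function names are hypothetical choices made for illustration.

```python
def implication(a, b):
    """Pseudocomplement for the Lukasiewicz t-norm: sup{c : max(0, a + c - 1) <= b}."""
    return min(1.0, 1.0 - a + b)

def equality_index(a, b):
    """Equality index (1): average over the grades and their complements."""
    direct = min(implication(a, b), implication(b, a))
    complemented = min(implication(1 - a, 1 - b), implication(1 - b, 1 - a))
    return 0.5 * (direct + complemented)

def energy(u, p=2):
    """Polynomial energy-type functional u**p."""
    return u ** p

def entropy(u):
    """Piecewise linear entropy-type functional with maximum 1 at u = 1/2."""
    return 2 * u if u <= 0.5 else 2 * (1 - u)

def transom(u, alpha=0.2, beta=0.8):
    """Transom-like functional: identity inside (alpha, beta), zero outside."""
    return u if alpha < u < beta else 0.0

plain = equality_index(0.9, 0.8)                        # depends only on |a - b|
through_energy = equality_index(energy(0.9), energy(0.8))
```

Applied to the raw grades, the index depends only on the distance between the arguments; applied after a functional, the same pair yields a different value depending on where in [0,1] the grades lie, which is the context sensitivity argued for above.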


Fig. 1. Examples of functionals ψ_1, ψ_2, ψ_3

Here we have:

ψ_1(u) = u, u ∈ [0, 1]

ψ_2(u) = 2u, u ∈ [0, 1/2] and ψ_2(u) = 2(1 - u), u ∈ [1/2, 1]

ψ_3(u) = ψ_1(u)[1(u - α) - 1(u - β)]

[...] summarized [...]

where w_l, l = 1, 2, [...] (2)

ψ_l[A(x_i)] ≡ ψ_l[B(x_j)]




The appropriate values of the connections can be derived through supervised learning. We will
discuss this issue in great detail in Section 4.

In order to emphasize the local character of matching and explicitly indicate the arguments
standing there (elements of X) we will introduce a two-variable predicate MATCH_LOCAL(x_i, x_j)
which is defined as follows,

$$\mathrm{MATCH\_LOCAL}(x_i, x_j) = [(\psi_1(A(x_i)) \equiv \psi_1(B(x_j)))\ \mathrm{OR}\ w_1]\ \mathrm{AND}\ \ldots\ \mathrm{AND}\ [(\psi_p(A(x_i)) \equiv \psi_p(B(x_j)))\ \mathrm{OR}\ w_p]$$

i, j = 1, 2, ..., n

3. Global level of matching

When it comes to the global level of matching involving all the elements (pairs) for which the local
matching operation has been accomplished we can think of the following model,

$$y = \mathop{\mathrm{OR}}_{i,j = 1, 2, \ldots, n} [\mathrm{MATCH\_LOCAL}(x_i, x_j)\ \mathrm{AND}\ v_{ij}] \qquad (3)$$

where v_ij ∈ [0,1] are connections modelling the influence the results of the local matching have
on the global level.



Fig. 3. Overall matching model


The complete structure composed of the processing units described by (3) is given in Figure
3. The grid of points shown there is formed by considering a Cartesian product of X's.

The entire model can be viewed as a heterogeneous OR-AND logic neural network, see [9].
The compact notation applied to it will then look like this,

$$y = \mathrm{MATCH}(A, B) = \mathop{\mathrm{OR}}_{(x_i, x_j) \in X \times X} [\mathrm{MATCH\_LOCAL}(x_i, x_j)\ \mathrm{AND}\ v_{ij}] \qquad (4)$$

where the OR operation pertains to the arguments of the Cartesian product X×X. The AND and
OR operations are realized by t-norms and s-norms, respectively. Thus we have

$$\mathrm{MATCH}(A, B) = \mathop{S}_{i,j=1}^{n} [\mathrm{MATCH\_LOCAL}(x_i, x_j)\ t\ v_{ij}] \qquad (5)$$

$$\mathrm{MATCH\_LOCAL}(x_i, x_j) = \mathop{T}_{l=1}^{p} [(\psi_l(A(x_i)) \equiv \psi_l(B(x_j)))\ s\ w_l] \qquad (6)$$

Expressions (5) - (6) form a basic distributed and hierarchical model of matching.
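
The two-level structure of (5) - (6) can be sketched as follows. The product t-norm, the probabilistic sum s-norm and the closed form 1 - |a - b| of the equality index (valid for the Lukasiewicz pseudocomplement) are assumptions made for this example; the paper leaves the choice of norms open at this point.

```python
def t_norm(a, b):
    """Product t-norm (models AND)."""
    return a * b

def s_norm(a, b):
    """Probabilistic sum s-norm (models OR)."""
    return a + b - a * b

def equality_index(a, b):
    """Closed form of the equality index under the Lukasiewicz pseudocomplement."""
    return 1.0 - abs(a - b)

def match_local(a_i, b_j, functionals, w):
    """Equation (6): AND over functionals of [(psi(a) == psi(b)) OR w_l]."""
    result = 1.0
    for psi, w_l in zip(functionals, w):
        result = t_norm(result, s_norm(equality_index(psi(a_i), psi(b_j)), w_l))
    return result

def match(A, B, functionals, w, v):
    """Equation (5): OR over all pairs (x_i, x_j) of [MATCH_LOCAL AND v_ij]."""
    result = 0.0
    for i, a_i in enumerate(A):
        for j, b_j in enumerate(B):
            local = match_local(a_i, b_j, functionals, w)
            result = s_norm(result, t_norm(local, v[i][j]))
    return result

identity = lambda u: u
A = [0.3, 0.7]
same = match(A, A, [identity], w=[0.0], v=[[1.0, 0.0], [0.0, 1.0]])
muted = match(A, A, [identity], w=[0.0], v=[[0.0, 0.0], [0.0, 0.0]])
```

A connection v_ij = 0 removes the pair (x_i, x_j) from the global aggregation, while w_l = 1 removes the l-th functional from the local one, so the connections act as relevance weights at the two levels.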

4. Specializing the model of matching - a parametric learning in the network

[...] already developed, now one has to determine its
parameters w = [w_l], l = 1, 2, ..., p and v = [v_ij], i, j = 1, 2, ..., n. This is carried out on the basis of
a training set of data. It consists of pairs of fuzzy sets A_k, B_k and associated results of matching
reported there. Denote them by t_k, k = 1, 2, ..., N. Usually we can concentrate on a simple scenario
in which t_k ∈ {0, 1}. If t_k = 0 the corresponding pair (A_k, B_k) delineates two fuzzy sets which are
different. On the other hand, if t_k = 1, A_k and B_k are viewed as being [...].

The learning (adjustments) of the connections is completed in the supervised mode. One
presents A_k and B_k to the model and compares the obtained result MATCH(A_k, B_k) with t_k; if these
are different then w and v have to be modified to reduce this difference. A performance index
guiding the adjustments is

$$Q = \sum_{k=1}^{N} [t_k - \mathrm{MATCH}(A_k, B_k)]^2 \qquad (7)$$

A standard Newton-like method is exploited to produce the required modifications of w and
v, namely

$$w = w - \eta\, \frac{\partial Q}{\partial w}$$

$$v = v - \eta\, \frac{\partial Q}{\partial v}$$

where η is a learning rate controlling a speed of changes of w and v.

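The derivatives are computed in a standard way,

$$\frac{\partial Q}{\partial v_{ij}} = -2 \sum_{k=1}^{N} [t_k - \mathrm{MATCH}(A_k, B_k)]\, \frac{\partial\, \mathrm{MATCH}(A_k, B_k)}{\partial v_{ij}}$$

$$\frac{\partial\, \mathrm{MATCH}(A_k, B_k)}{\partial v_{ij}} = \frac{\partial}{\partial v_{ij}} \mathop{S}_{i_1, j_1} [\mathrm{MATCH\_LOCAL}(x_{i_1}, x_{j_1})\ t\ v_{i_1 j_1}] = \frac{\partial}{\partial v_{ij}} \left\{ [\mathrm{MATCH\_LOCAL}(x_i, x_j)\ t\ v_{ij}]\ s\ S_{ij} \right\}$$

where S_ij aggregates all the remaining pairs,

$$S_{ij} = \mathop{S}_{(i_1, j_1) \ne (i, j)} [\mathrm{MATCH\_LOCAL}(x_{i_1}, x_{j_1})\ t\ v_{i_1 j_1}]$$

For the connections w we derive similarly,

$$\frac{\partial Q}{\partial w_l} = -2 \sum_{k=1}^{N} [t_k - \mathrm{MATCH}(A_k, B_k)]\, \frac{\partial\, \mathrm{MATCH}(A_k, B_k)}{\partial w_l}$$

and

$$\frac{\partial\, \mathrm{MATCH}(A_k, B_k)}{\partial w_l} = \sum_{i,j=1,2,\ldots,n} \frac{\partial\, \mathrm{MATCH}(A_k, B_k)}{\partial\, \mathrm{MATCH\_LOCAL}(x_i, x_j)}\, \frac{\partial\, \mathrm{MATCH\_LOCAL}(x_i, x_j)}{\partial w_l}$$

Subsequently we conclude

$$\frac{\partial\, \mathrm{MATCH}(A_k, B_k)}{\partial\, \mathrm{MATCH\_LOCAL}(x_i, x_j)} = \frac{\partial \left\{ [\mathrm{MATCH\_LOCAL}(x_i, x_j)\ t\ v_{ij}]\ s\ S_{ij} \right\}}{\partial\, \mathrm{MATCH\_LOCAL}(x_i, x_j)}$$

with S_ij as above, and finally

$$\frac{\partial\, \mathrm{MATCH\_LOCAL}(x_i, x_j)}{\partial w_l} = \frac{\partial}{\partial w_l} \left\{ [(\psi_l(A(x_i)) \equiv \psi_l(B(x_j)))\ s\ w_l]\ t\ T_l \right\}$$

$$T_l = \mathop{T}_{l_1 \ne l} [(\psi_{l_1}(A(x_i)) \equiv \psi_{l_1}(B(x_j)))\ s\ w_{l_1}]$$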
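
For one concrete choice of norms the derivative with respect to v_ij has a simple closed form. The sketch below assumes the product t-norm and the probabilistic sum, so that MATCH = 1 - Π(1 - m_ij v_ij) and ∂MATCH/∂v_ij = m_ij (1 - S_ij), where m_ij stands for MATCH_LOCAL(x_i, x_j). The clipping of v to [0, 1] and all names are our assumptions, not part of the paper.

```python
def match_from_local(m, v):
    """MATCH = S_ij [m_ij t v_ij] for product t-norm and probabilistic sum."""
    remainder = 1.0
    for m_row, v_row in zip(m, v):
        for m_ij, v_ij in zip(m_row, v_row):
            remainder *= 1.0 - m_ij * v_ij
    return 1.0 - remainder

def dmatch_dv(m, v, i, j):
    """Analytic derivative m_ij * (1 - S_ij); S_ij aggregates all other pairs."""
    others = 1.0
    for i1, (m_row, v_row) in enumerate(zip(m, v)):
        for j1, (m_ij, v_ij) in enumerate(zip(m_row, v_row)):
            if (i1, j1) != (i, j):
                others *= 1.0 - m_ij * v_ij
    return m[i][j] * others

def update_v(m, v, target, eta):
    """One step v <- v - eta * dQ/dv for a single training pair, clipped to [0, 1]."""
    error = target - match_from_local(m, v)
    return [[min(1.0, max(0.0, v[i][j] + 2.0 * eta * error * dmatch_dv(m, v, i, j)))
             for j in range(len(v[i]))] for i in range(len(v))]

m = [[0.9, 0.2], [0.4, 0.7]]       # hypothetical local matching results
v = [[0.5, 0.5], [0.5, 0.5]]
v_new = update_v(m, v, target=1.0, eta=0.2)
```

Since dQ/dv_ij = -2 (t - MATCH) ∂MATCH/∂v_ij, the update adds 2η (t - MATCH) ∂MATCH/∂v_ij to each connection; with a target of 1 every connection grows and the global matching value increases.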

The following example serves as an illustrative material showing how the hierarchical model of
matching can be developed.

Example. The fuzzy sets used in this simulation experiment are given below:

         x_1    x_2    x_3             x_1    x_2    x_3      Matching
A_1 = [ 0.95   0.71   0.2  ]   B_1 = [ 0.4    0.9    0.05 ]     1.0
A_2 = [ 0.1    0.3    0.58 ]   B_2 = [ 0.15   1.0    0.01 ]     0.0
A_3 = [ 0.25   0.4    1.0  ]   B_3 = [ 0.5    1.0    0.7  ]     1.0
A_4 = [ 0.45   1.0    0.75 ]   B_4 = [ 1.0    0.2    0.45 ]     0.0


This synthetic data set includes some pairs of fuzzy sets exhibiting equality (t_k = 1) [and] difference
between A_k and B_k (t_k = 0). We will consider the functionals ψ_1, ψ_2 and ψ_3 shown in Fig. 1.
[...] of (5) - (6), the performance index is given by (7) and
η = 0.2. The process of learning is visualized in Fig. 4.
The results generated by [...] are given below:

k     t_k     MATCH(A_k, B_k)
1     1.0     0.71
2     0.0     0.33
3     1.0     0.64
4     0.0     0.43


Even though the outcomes in these two columns are numerically different, they become qualitatively
equivalent after thresholding applied to the results produced by the matching model. Let us define a
threshold operation

t(a, λ) = 1 if a ≥ λ and t(a, λ) = 0 if a < λ,

a, λ ∈ [0,1],

where λ stands for the threshold value. One can easily verify that for all λ from (0.43, 0.64) the
results produced by the model (after thresholding) are equivalent to those included in the
training set.
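
The thresholding claim can be verified directly on the reported values. The candidate thresholds below are arbitrary sample points from the interval (0.43, 0.64), chosen only for illustration.

```python
def threshold(a, lam):
    """Threshold operation: 1 if a >= lam, otherwise 0."""
    return 1 if a >= lam else 0

targets = [1, 0, 1, 0]                  # t_k from the training set
outputs = [0.71, 0.33, 0.64, 0.43]      # MATCH(A_k, B_k) reported above
candidates = [0.44, 0.50, 0.60, 0.63]   # sample thresholds inside (0.43, 0.64)

agreement = {lam: [threshold(a, lam) for a in outputs] == targets
             for lam in candidates}
```

Every sampled threshold reproduces the training targets, whereas a threshold above 0.64 would misclassify the third pair.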

Fig. 4. Results of learning in the network (performance index versus learning epoch)


5. Conclusions 

We have developed a distributed model of matching of fuzzy sets utilizing AND and OR basic 
computing elements. It has been shown that they carry out “local” matching which is realized at 
the level of each element of the universe of discourse and includes both grades of membership as 
well as some of their functionals. The global level aggregates them in a disjunctive form. 

The idea of the distributed logical processing can be also found useful in developing models of 
fuzzy set connectives or designing non-pointwise decision-making procedures. 

Acknowledgement 

Support from the Natural Sciences and Engineering Research Council of Canada for both 
authors and MICRONET for the first author is gratefully acknowledged. 

6. References 

1. A. De Luca, S. Termini, “A definition of nonprobabilistic entropy in the setting of fuzzy sets”.
Information and Control, 20, 1972, 301-312.

2. A. De Luca, S. Termini, “Entropy and energy measures of a fuzzy set”. In: Advances in Fuzzy
Set Theory and Applications, M.M. Gupta, R.K. Ragade, R.R. Yager, eds., North Holland, [...]

3. [...] “Fuzzy sets and statistical data”. European J. of Operational Research. [...]

4. D. Dubois, H. Prade, C. Testemale, “Weighted fuzzy pattern matching”. Fuzzy Sets and
Systems, [...]

5. [...] Théorie des Possibilités. Masson, Paris, 1985.

6. K. Hirota, W. Pedrycz, “Handling fuzziness and randomness in processes of matching of fuzzy
[...]”

7. W. Pedrycz, “[...] comparison of fuzzy data”. Fuzzy Sets and [...]

8. [...] “[...] relational systems”. IEEE Trans. on Pattern Analysis and [...]

9. [...] “[...] inference neurons as pattern classifiers”. IEEE Trans. [...]

10. [...] “[...] of a fuzzy set”. Fuzzy Sets and Systems, vol. 36, 2, [...]

11. [...] “[...] a class of measures of fuzziness”, J. Math. Analysis and Appl. (in press).

12. B. Schweizer, A. Sklar, Probabilistic Metric Spaces, North Holland, Amsterdam, 1983.




N93-29581


A Conjugate Gradients /Trust Regions Algorithm for 
Training Multilayer Perceptrons for Nonlinear Mapping 

Raghavendra K. Madyastha Behnaam Aazhang 
Department of Electrical and Computer Engineering, Rice University, Houston, TX 
Troy F. Henson Wendy L. Huxhold 
IBM Corporation, Houston, TX 

October 13, 1992 





Abstract 

This paper addresses the issue of applying a globally convergent optimization algorithm to the training of multilayer
perceptrons, a class of Artificial Neural Networks. The multilayer perceptrons are trained towards the solution
of two highly nonlinear problems: i) Signal detection in a multi-user communication network and ii) Solving the
inverse kinematics for a robotic manipulator. The research is motivated by the fact that a multilayer perceptron is
theoretically capable of approximating any nonlinear function to within a specified accuracy. The algorithm that has
been employed in this study combines the merits of two well known optimization algorithms, the Conjugate Gradients
and the Trust Regions Algorithms. The performance is compared to a widely used algorithm, the Backpropagation
Algorithm, that is basically a gradient-based algorithm, and hence, slow in converging. The performances of the two
algorithms are compared in terms of the convergence rate. Furthermore, in the case of the signal detection problem,
performances are also benchmarked by the decision boundaries drawn as well as the probability of error obtained in
either case.

I Introduction 

Artificial Neural Networks (Neural Nets for short) are densely interconnected layers of relatively simple processing
units called nodes, that are interconnected through links called weights (w represents the weight vector). The output
of any node in a layer is a nonlinear function of a weighted sum of inputs from nodes in the previous layer. Due
to the nonlinear characteristics of these networks, they are used for a wide variety of nonlinear mapping problems.
This paper deals with two specific applications: detection of a single user’s signal in a multi-user communication
channel and solution of the inverse kinematics for a robotic manipulator. The neural net used in these problems is
the multilayer perceptron.

Multilayer perceptrons are a class of feed-forward artificial neural networks [1, 2, 3] with one or more layers
(termed hidden layers) between the inputs and outputs. Their use in this context is based on the fact that multilayer
perceptrons with a single hidden layer are theoretically capable of approximating any nonlinear function to a desired
accuracy [4]. The general classification/mapping problem can be reduced to solving an optimization problem as
shown below

$$\mathbf{w}_* = \arg\min_{\mathbf{w} \in \mathbb{R}^n} e(\mathbf{w}), \quad e: \mathbb{R}^n \to \mathbb{R}. \qquad (1)$$

The optimization algorithm used to calculate w_* that solves (1) is termed the training algorithm of the multilayer
perceptron. The error function e is typically taken to be an average of the sum of the squares of the differences
between the desired and actual (produced by the neural net) outputs to given inputs

$$e(\mathbf{w}) = \frac{1}{P} \sum_{p=1}^{P} \sum_{i=1}^{M_L} \left( \hat{d}_i(p; \mathbf{w}) - d_i(p) \right)^2, \qquad (2)$$

where d_i(p) is the i-th component of the desired output vector, i.e., the desired output at the i-th node (of M_L
output nodes) corresponding to the p-th input pattern, \hat{d}_i(p; w) represents the actual output and P
represents the total number of training patterns presented to train the neural network.



General optimization problems, as in (1), have no analytic solution, and hence, iterative optimization schemes
that yield a series of converging approximations to w_* are employed to solve (1). The focus of this research is
the development of an efficient training algorithm that performs “significantly better” than the widely employed
backpropagation algorithm [1, 3, 5, 6]. Since the backpropagation algorithm is a gradient-based algorithm (it is based
on the steepest descent algorithm) it exhibits very poor convergence properties^1, being at best linearly convergent [7,
8, 9]. We investigate an optimization algorithm that combines the merits of two well known optimization algorithms:
the trust region algorithm and the conjugate gradient algorithm. The resulting algorithm, termed the Conjugate
gradient-Trust region (CGTR) algorithm, has been shown to exhibit superlinear convergence properties [10, 11]. The
CGTR algorithm significantly outperforms the backpropagation algorithm in the applications considered.


II Multilayer Perceptrons 

Multilayer perceptrons form a particular class of neural networks and are capable of approximating any nonlinear
measurable function. Specifically, it has been shown by Hornik, Stinchcombe and White [4] that a two-layer
perceptron, i.e., a perceptron with an input layer, one hidden layer and an output layer of nodes, is sufficient
to achieve this approximation. This capability has been exploited in various fields including speech recognition,
signal and pattern classification and universal approximation (see references in [1]). With reference to a single-user
detection problem in a multi-user communication channel, wherein the optimal decision boundary has been shown
to be a highly nonlinear curve in the signal space [12, 13], multilayer perceptron receivers have been observed to
perform better than conventional techniques [14]. In this study we apply the multilayer perceptron, which is trained
using the CGTR algorithm as opposed to the conventional backpropagation, to the two specific problems at hand,
signal detection and nonlinear mapping.

Figure 1: Typical Structure of a 2-Layer Perceptron


These networks consist of an input layer of nodes, one or more layers of hidden (i.e., intermediate) nodes and
a layer of output nodes (Figure 1). The nodes in a given layer are connected to all the nodes of the next (upper)
layer. Therefore in an L-layer perceptron, the output of the i-th node of the ℓ-th layer takes the value

$$v_i^{(\ell)} = s\Big( \sum_{j=1}^{M_{\ell-1}} w_{ij}^{(\ell)} v_j^{(\ell-1)} + \theta_i^{(\ell)} \Big), \quad i = 1, 2, \ldots, M_\ell, \quad \ell = 1, 2, \ldots, L, \qquad (3)$$

where M_ℓ denotes the number of nodes in the ℓ-th layer, w_ij^(ℓ) denotes the weight associated with the connection
between the j-th node of the (ℓ-1)-th layer and the i-th node of the ℓ-th layer, and θ_i^(ℓ) is the corresponding threshold.
The function s(·) is the nonlinear transformation at the output of the i-th node of the ℓ-th layer, called the activation
or squashing function. In this model, v_j^(0) represents the j-th input to the network and M_0 is the total number of
inputs. The measure of the error e(·) that arises naturally from the network configuration is the average sum of
squared errors which is calculated as in (2) with \hat{d}_i(p; w) = v_i^(L)(w) representing the actual perceptron output.
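
The layer recursion (3) and the error measure (2) can be sketched as below. This is a minimal illustration, assuming a logistic squashing function and thresholds that are added to the weighted sum; neither choice is confirmed by the recoverable text, and the function names are ours.

```python
import math

def sigmoid(x):
    """Logistic squashing function, assumed here for s(.)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, layers):
    """Propagate inputs through the layers.

    layers is a list of (weights, thresholds); weights[i][j] links node j of the
    previous layer to node i of the current layer.
    """
    v = inputs
    for weights, thresholds in layers:
        v = [sigmoid(sum(w_ij * v_j for w_ij, v_j in zip(row, v)) + theta_i)
             for row, theta_i in zip(weights, thresholds)]
    return v

def average_squared_error(patterns, layers):
    """Equation (2): squared output errors summed per pattern, averaged over patterns."""
    total = 0.0
    for inputs, desired in patterns:
        actual = forward(inputs, layers)
        total += sum((a - d) ** 2 for a, d in zip(actual, desired))
    return total / len(patterns)

# A 2-2-1 network with all weights and thresholds set to zero, illustration only.
layers = [([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
          ([[0.0, 0.0]], [0.0])]
output = forward([1.0, 2.0], layers)
error = average_squared_error([([1.0, 2.0], [1.0]), ([0.0, 1.0], [0.0])], layers)
```

With zero weights every node outputs s(0) = 0.5, so both training patterns contribute a squared error of 0.25 and the average is 0.25.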

1 For details regarding the different types of convergence see [7, 8].


III Conjugate Gradients/Trust Regions Training Algorithm

The CGTR algorithm is a nested combination of the trust regions algorithm and the conjugate gradients algorithm
that attempts to circumvent the limitations of both schemes. In order to illustrate the properties of the proposed
training algorithm, a brief overview of iterative optimization algorithms is presented below, with emphasis on the
trust regions model and the conjugate gradient algorithm.

The basic optimization problem can be formulated as in (1), where e(w) is a twice continuously differentiable
function of w. The main strategy in most optimization algorithms is to approximate the nonlinear function
about the point w_k by a second order Taylor series expansion, called the quadratic model of e at w_k,

$$m_k(\mathbf{w}_k + \mathbf{s}) = e(\mathbf{w}_k) + \nabla e(\mathbf{w}_k)^T \mathbf{s} + \tfrac{1}{2}\, \mathbf{s}^T H \mathbf{s}, \qquad (4)$$

where H represents the Hessian matrix of e, i.e., {H}_ij = ∂²e/∂w_i∂w_j. One optimization scheme is to successively
minimize the function along the steepest descent direction, i.e., the negative gradient at each point. This algorithm
has been found to possess extremely slow (linear) convergence properties. The prevalent Backpropagation algorithm
is based on the steepest descent algorithm.

An alternative class of algorithms that are extremely robust with respect to the wide variety of functions to
which they are applicable are the trust region algorithms. The main idea behind these methods is that the given
nonlinear function is approximated fairly accurately by a Taylor’s series quadratic model only in some region around
the current point. This leads to the following formulation of the optimization problem

$$\min_{\mathbf{s} \in \mathbb{R}^n} m_k(\mathbf{w}_k + \mathbf{s}), \quad \text{subject to } \|\mathbf{s}\| \le \delta_k, \qquad (5)$$

where m_k(w_k + s) is as defined in (4), s is the step taken and δ_k is a parameter that can be interpreted as an
estimate of how far we “trust” m_k(w_k + s) to accurately model the actual function e(w) in a neighborhood about
w_k. The parameter δ_k is accordingly called the trust radius. The trust region algorithm can then be succinctly
stated as minimizing m_k(w_k + s) over a compact domain in s at each iteration. This problem has been shown to have
a unique solution [7, 15, 16] for arbitrary H, but is numerically intractable.

The conjugate gradients algorithm arrives at the minimiser of a positive definite $n$-dimensional
quadratic function (i.e., the Hessian $H$ is positive definite) in at most $n$ steps [9, 17, 18, 19], taken
along mutually $H$-conjugate² directions. It is a computationally simple and elegant algorithm that has minimal storage
requirements. The rate of convergence of the conjugate gradients algorithm is found to depend on the condition
number $\kappa$ of $H$ [9, 17, 20, 21]. Therefore, the convergence rate could be enhanced by suitably modifying the Hessian,
clustering its eigenvalues and thereby decreasing $\kappa$. This technique, known as preconditioning, is achieved by effecting
a linear transformation of the variable space. Notwithstanding the attractive features of the conjugate gradients
algorithm, it is shown to be numerically unstable when applied to non-positive definite functions [20, 17, 18, 22].
Therefore it has to be used in conjunction with a method that allows for indefinite Hessians.
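
As an illustration of the "at most n steps" property, the following is a minimal sketch of the unpreconditioned linear conjugate gradients iteration applied to the quadratic 0.5 x^T H x - b^T x. The function names, the stopping tolerance and the 2-by-2 example matrix are assumptions made for this sketch and do not come from the paper.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, v) for row in A]


def conjugate_gradient(H, b, tol=1e-20):
    """Minimise 0.5*x^T H x - b^T x for a symmetric positive definite H.

    Returns the minimiser and the number of steps taken (at most n).
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - H x at x = 0 (the negative gradient)
    p = list(r)          # first direction is the steepest descent direction
    rr = dot(r, r)
    steps = 0
    while steps < n and rr > tol:
        Hp = matvec(H, p)
        alpha = rr / dot(p, Hp)                  # exact minimiser along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rr_new = dot(r, r)
        beta = rr_new / rr                       # keeps directions H-conjugate
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        steps += 1
    return x, steps


# Hypothetical 2-dimensional example: the minimiser solves H x = b.
x, steps = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The step length divides by p^T H p, which is why the plain iteration breaks down when H is not positive definite.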

The proposed CGTR algorithm effectively combines the merits of both the above stated algorithms [10, 11]. The
problem addressed is as posed by (5), with the imposed constraint being $\|s\|_C \le \delta_k$, where $\|s\|_C = (s^T C s)^{1/2}$, $C$
being the preconditioning matrix. The conjugate gradients algorithm is embedded in the trust region model and
serves to arrive at the minimizer of $m_k(w_k + s)$ at each iterate, while the trust region part decides whether a
particular $w_k$ reached is a “suitable” point or not. The trust region algorithm governs the global convergence and
thus enables the CGTR algorithm to effectively deal with non-positive definite Hessians. This algorithm has been
theoretically shown to be superlinearly convergent to a minimizer of the nonlinear function $e$ [10, 11].
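
The way conjugate gradients can be embedded in a trust region may be sketched as follows. This is a simplified illustration in the spirit of the truncated conjugate gradient approach of [10], not the authors' implementation: it assumes no preconditioning (C equal to the identity), and the function names, tolerance and example data are hypothetical. The inner iteration stops early either when it meets a direction of non-positive curvature or when the next iterate would leave the trust region, and in both cases it returns the point where the current direction crosses the boundary.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def boundary_step(s, p, delta):
    """Return s + tau*p with tau >= 0 chosen so that the norm equals delta."""
    a = dot(p, p)
    b = 2.0 * dot(s, p)
    c = dot(s, s) - delta * delta        # non-positive while s is inside
    tau = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    return [si + tau * pi for si, pi in zip(s, p)]


def truncated_cg_step(H, g, delta, tol=1e-20):
    """Approximately minimise g^T s + 0.5*s^T H s subject to ||s|| <= delta."""
    n = len(g)
    s = [0.0] * n
    r = [-gi for gi in g]                # negative model gradient at s = 0
    p = list(r)
    rr = dot(r, r)
    for _ in range(n):
        if rr <= tol:
            break
        Hp = matvec(H, p)
        curvature = dot(p, Hp)
        if curvature <= 0.0:             # indefinite Hessian: go to the boundary
            return boundary_step(s, p, delta), "negative curvature"
        alpha = rr / curvature
        s_next = [si + alpha * pi for si, pi in zip(s, p)]
        if math.sqrt(dot(s_next, s_next)) >= delta:
            return boundary_step(s, p, delta), "hit boundary"
        s = s_next
        r = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return s, "interior"


# Hypothetical indefinite example: plain conjugate gradients would fail here.
step, status = truncated_cg_step([[1.0, 0.0], [0.0, -2.0]], [1.0, 1.0], 1.0)
```

In a full trust region method the returned step would then be accepted or rejected, and the radius adjusted, by comparing the actual reduction in e with the reduction predicted by the model; that outer loop is omitted here.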


IV Applications 

In view of the deficiencies in the existing training algorithms, there is a need for the development of faster and
more robust algorithms. This paper discusses the application of the CGTR algorithm to the training of multilayer
perceptrons for signal classification and nonlinear mapping (inverse kinematics) problems based on their universal
approximation capabilities [4]. The multilayer perceptron is trained in both cases by presenting a set of input data
and minimising the resultant error function between the desired and actual outputs of the network to arrive at an
optimal weight configuration $w_*$. The input-output set in the signal detection scenario consists of $P$ pairs of the
sampled received signal vector and the corresponding classification, while in the solution of the inverse kinematics
of a robotic manipulator, the input-output set consists of $P$ pairs of end-effector (see Section IV.2) coordinates and
the corresponding joint angles.
² Two directions $p_i$ and $p_j$ are said to be $H$-conjugate directions if $p_i^T H p_j = 0$ for all $i \ne j$.





Figure 2: Structure of a Neural Net Receiver for Single User Detection in Multiuser Channels. 


IV.1 Single User Detection in a Multi-user Communication Channel

We investigate the feasibility of using multilayer perceptrons with the proposed training algorithm for the detection 
of signals transmitted by a single user in a multi-user channel with additive Gaussian noise. The neural net receiver 
is configured to demodulate the particular user’s signal in the presence of other interfering signals. This is shown to 
be equivalent to approximating a nonlinear function, the optimum decision boundary [12, 13].

In the general multiple-access communication network [14], K transmitters are assumed to share a radio band 
in time and code domains. A particular user’s transmitted signal is a binary signal set derived from the set of 
coded waveforms assigned to that user. We assume that we are interested in the demodulation of the first user’s 
information packet. The signal at the receiver is the sum of the K transmitted signals in additive channel noise 
(which is assumed to be Gaussian here). 



Figure 3: Optimum Decision Boundary for the Detection of a Single User in a 2-user Channel 

The sampled input vector $r$ to the neural net receiver (see Figure 2) can be written so that the demodulation
of the first signal is viewed as the following classification problem:

$$ H_0 : r = +A_1 a^{(1)} + \zeta + \eta $$

$$ H_1 : r = -A_1 a^{(1)} + \zeta + \eta \qquad (6) $$

where $a^{(1)}$ is the spreading code vector of the first user and $\eta$ is a length-$N$ vector of filtered Gaussian noise samples. In
this setting, $\zeta$ represents the multiple-access interference vector, i.e., the interference due to the presence of the other
transmitted signals. The optimum decision boundary for the general single-user detection problem in the presence
of interfering users has been found to be a highly nonlinear surface in the signal space [13]. Therefore conventional
matched filter techniques, which generate only linear detection boundaries (see Figure 3), fail to accurately demodulate
the desired user’s signal.



Figure 4: Decision boundaries drawn after training with the CGTR and BP algorithms. 


Performance Analysis 

To illustrate the potential of the multi-layer perceptron for signal detection, a relatively simple example involving the
detection of a single user’s signal in the presence of only one other interfering user is considered. The network used for
this problem is a two-layer perceptron with three nodes in the middle layer. This is based on work done by Aazhang,
Orsak and Paris [14], who have conjectured that, since the optimum decision boundary can be approximated by three
straight lines, three nodes in the middle layer are sufficient for near-optimum demodulation.

Training of the multi-layer perceptron is performed by presenting a fixed number of input vectors to the network
and specifying the corresponding desired outputs. The error function obtained is then minimised with respect to
the network weights using the CGTR method. In the case of the signal classification problem, the $P$ input data
represent observations of $r$, i.e., actual signal locations with additive noise. The relevance of using signals with noise
as training data lies in the fact that in a practical application the neural net receiver would be receiving noisy data
and would have to detect the presence of a particular user in the presence of additive noise as well as interfering
signals. Therefore, training the multilayer perceptron with noisy data makes the detector insensitive to perturbations
in the incoming signals.

The performance of the multilayer perceptron trained by a particular algorithm, in the case of the detection of
a single user in the presence of interfering users, is assessed by the proximity of the decision boundary drawn by
the multilayer perceptron to the optimum and by the average probability of misclassification. Figure 4 shows the
decision boundaries drawn by the neural net receiver trained with the CGTR and backpropagation algorithms.
As can be seen, the decision boundary drawn after training with the CGTR algorithm closely approximates the
optimum decision boundary, while the neural net trained with the backpropagation algorithm is only able to linearly
approximate this nonlinear function. Further comparisons of performance can be made by observing the plots of the
probability of error for the receiver versus the ratio of the signal-to-noise ratios of the two signals after training with
both the algorithms, as seen in Fig 5. The first plot in Fig 5 depicts the probability of error (Pe) graphed against
the ratio of SNR1 (signal-to-noise ratio of user 1) to SNR2 (signal-to-noise ratio of user 2), with SNR2 fixed at 6 dB. The










Figure 5: Probability of error vs ratio of the SNR’s of the two signals. 

second plot depicts Pe graphed against the ratio of SNR2 to SNR1, with SNR1 fixed at 6 dB. As can be seen in both
plots, the receiver trained with the CGTR algorithm yields a lower probability of error compared to the receiver
trained with backpropagation.

IV.2 Inverse Kinematics for a Robotic Manipulator

The capability of the multilayer perceptron for function approximation is further tested by training it via the
preconditioned CGTR algorithm to approximate the inverse kinematics for a robotic manipulator. We briefly
describe the inverse kinematics problem for a robotic manipulator. For further details the reader is referred to
[23, 24, 25]. A mechanical manipulator or arm can be modeled as an interconnection of several links, each of which
is connected to its predecessor through a joint. One end of the arm is attached to a base and the other end to an
end-effector or gripper. Robot manipulator kinematics deals with the analytic study of the motion of the robot arm
with respect to a particular coordinate system. The inverse kinematics problem for a robotic manipulator involves
the determination of the individual joint angles (angles between successive joints) $\theta(t) = [\theta_1(t), \ldots, \theta_n(t)]^T$, given
the spatial location of the end-effector $r(t)$. This problem is solved using the equation

$$ r(t) = f(\theta(t)), \qquad (7) $$

where $f$ is a nonlinear function. In general, for most manipulators there does not exist a continuous analytic $f^{-1}$
over the whole space and even if it does, its solution can be analytically and numerically cumbersome.






In this paper we attempt to solve the inverse kinematics problem by training the multilayer perceptron to
approximate $f^{-1}$. The robotic manipulator arm considered in this study is a planar bi-linked arm, i.e., an arm with
two segments constrained to lie in a single plane, as depicted in Fig 6. The joint angles $\theta_1(t)$ and $\theta_2(t)$ are calculated
as


$$ \theta_2(t) = \cos^{-1}\left( \cdots \right) $$

$$ \theta_1(t) = \tan^{-1}\left( \cdots \right) - \tan^{-1}\left( \cdots \right) $$


The neural net is trained using the CGTR algorithm to approximate the above system of nonlinear equations.
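
For orientation, the closed-form inverse kinematics of a planar two-link arm can be sketched as below. This uses the standard textbook solution, which has the same form (one inverse cosine for the second angle, a difference of two inverse tangents for the first) but is not claimed to be the authors' exact expression. The link lengths `l1` and `l2`, the function names and the use of `atan2` in place of a plain inverse tangent are assumptions of this sketch.

```python
import math


def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """End-effector position (x, y) of a planar two-link arm."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """Joint angles for a reachable point, with theta2 taken in [0, pi]."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    c2 = max(-1.0, min(1.0, c2))         # guard against rounding outside [-1, 1]
    theta2 = math.acos(c2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2


# Round trip on a hypothetical configuration: angles -> position -> angles.
x, y = forward_kinematics(0.3, 0.8)
theta1, theta2 = inverse_kinematics(x, y)
```

Pairs of end-effector positions and joint angles generated this way are one possible source of the input-output training set described in the text.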


Performance Analysis 



Figure 7: Multi-layer perceptron error function with increasing training set size.






The training is carried out by presenting pairs of input vectors and the corresponding desired joint angles to the
neural net and minimising the corresponding error function, as in (2), using the CGTR algorithm. The network
configuration considered for this example is a three-layered network with 4 nodes in each of the middle layers. It has
been observed that a three-layer network with as few as four nodes in each of the middle layers yields much better
performance than a two-layer network with as many as 30 nodes in the middle layer. The numbers of input and
output nodes correspond to the dimension of the work-space and the number of joint angles respectively. Figure 7
shows the change in the perceptron training error (for the three-layer perceptron with 4 nodes in each middle layer)
with increasing training set size and, as can be seen, the error stabilizes after a certain point (400 points in this case).

V Conclusions 

We have demonstrated the potential of the CGTR training algorithm for multilayer perceptrons and compared it to 
the existing backpropagation algorithm. The CGTR algorithm performed significantly better than backpropagation 
in the applications considered. Specifically, in the case of the detection of a single transmitter’s signal in the presence 
of interfering users, the network trained with backpropagation was able to draw only a linear decision boundary 
compared to the near-optimum decision boundary obtained by training with the CGTR algorithm. Correspondingly, 
training with CGTR resulted in a lower probability of error. In the solution of the inverse kinematics problem, our 
preliminary results have demonstrated the effectiveness of the CGTR algorithm in enabling the multilayer perceptron
to successfully approximate the nonlinear functions involved. Further research is being carried out using two-layer
perceptrons with a larger number of nodes in the middle layer as well as four-layer perceptrons.

References 

[1] R. P. Lippmann, “An Introduction to Computing with Neural Nets," IEEE ASSP Magazine, vol. 4, no. 2, 
pp. 4-22, April, 1987. 

[2] M. Minsky and S. Papert, Perceptrons: An Introduction to Computational Geometry. MIT Press, 1969. 

[3] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning Internal Representations by Error Propagation,”
in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. I: Foundations (D. E.
Rumelhart and J. L. McClelland, eds.), pp. 318-362, MIT Press, 1986.

[4] K. Hornik, M. Stinchcombe, and H. White, “Multilayer Feedforward Networks are Universal Approximators,” 
Journal on Neural Networks, vol. 2, pp. 359-366, 1989. 

[5] P. Werbos, Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, 
Harvard University, Cambridge, MA, 1974. 

[6] F. Rosenblatt, “Perceptrons and the theory of brain mechanisms,” in Principles of Neurodynamics, Washington,
D.C.: Spartan Books, 1962.

[7] J. E. Dennis, Jr., and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear
Equations. Prentice-Hall Series in Computational Mathematics, Englewood Cliffs, New Jersey: Prentice-Hall
Inc., 1983.

[8] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables. New York: 
Academic Press, 1970. 

[9] M. Avriel, Nonlinear Programming: Analysis and Methods. Prentice-Hall Series in Automatic Computation,
Englewood Cliffs, New Jersey: Prentice-Hall, 1976.

[10] T. Steihaug, “The Conjugate Gradient Method and Trust Regions in Large Scale Optimization,” Tech. Rep. 
Rice MASC TR81-1, Department of Mathematical Sciences, Rice University, Houston, TX, 1982. 

[11] T. Steihaug, Quasi-Newton Methods for Large Scale Nonlinear Problems. PhD thesis, Yale University, New
Haven, Connecticut, 1980.

[12] H. V. Poor and S. Verdu, “Single-User Detectors for Multiuser Channels,” IEEE Trans. Commun., vol. COM-36, 
no. 1, pp. 50-60, January, 1988. 

[13] S. Verdu, “Optimum Multiuser Asymptotic Efficiency,” IEEE Trans. Commun., vol. COM-34, pp. 890-897, 
September, 1986. 

[14] B. Aazhang, B.-P. Paris, and G. Orsak, “Neural Networks for Multiuser Detection in Code-Division Multiple- 
Access Communications,” IEEE Trans. Commun., vol. COM-40, no. 7, July, 1992. 

[15] J. J. Moré and D. C. Sorensen, “Computing a Trust Region Step,” SIAM J. Sci. Stat. Comput., vol. 4,
pp. 553-572, September 1983.

[16] D. C. Sorensen, “Newton’s Method with a Model Trust Region Modification,” SIAM J. Numer. Anal., vol. 19,
pp. 409-426, 1982.




[17] M. R. Hestenes and E. Stiefel, “Methods of Conjugate Gradients for Solving Linear Systems,” Journal of Res.
Nat. Bureau of Standards, vol. 49, pp. 409-436, December 1952.

[18] M. R. Hestenes, Conjugate Direction Methods in Optimization. Applications of Mathematics, New York:
Springer-Verlag, 1980.

[19] G. H. Golub and C. F. Van Loan, Matrix Computations. Baltimore: The Johns Hopkins University Press,
second ed., 1989.

[20] O. Axelsson and V. A. Barker, Finite Element Solution of Boundary Value Problems: Theory and Computation.
Orlando: Academic Press, 1984.

[21] O. Axelsson, “Solution of Linear Systems of Equations: Iterative Methods,” in Sparse Matrix Techniques
(V. A. Barker, ed.), vol. 572 of Lecture Notes in Mathematics, pp. 1-51, New York: Springer-Verlag, 1977.

[22] S. G. Nash, “Preconditioning of Truncated-Newton Methods,” SIAM J. Sci. Stat. Comput., vol. 6, pp. 599-616,
1985.

[23] [authors illegible], “[...] method to inverse [...] of robotic manipulators,” [...].

[24] K. S. Fu, R. C. Gonzalez, and C. S. G. Lee, Robotics: Control, Sensing, Vision, and Intelligence. [...] Engineering
Series, McGraw-Hill, 1987.

[25] Y. Nakamura and H. Hanafusa, “Inverse kinematic solutions with singularity robustness for robot manipulator
control,” Journal of Dynamic Systems, Measurement and Control, pp. 163-171, September 1986.







ON PROBABILITY-POSSIBILITY TRANSFORMATIONS


George J. Klir

Department of Systems Science 
Thomas J. Watson School of Engineering and Applied Science 
State University of New York 
Binghamton, New York 13902, U.S.A. 



and 


Behzad Parviz 

Department of Mathematics and Computer Science 
California State University 
Los Angeles, California 90032, U.S.A 

ABSTRACT 

Several probability-possibility transformations are compared in terms of the 
closeness of preserving second-order properties. The comparison is based on 
experimental results obtained by computer simulation. Two second-order properties 
are involved in this study: noninteraction of two distributions and projections of a 
joint distribution. 


1. Introduction 

During the last three decades or so, science has been undergoing a major 
paradigm shift involving attitudes towards uncertainty. The many facets of this 
paradigm shift are well described in a book by Smithson [1989]. 

The old paradigm is characterized by the pursuit of absolutely certain
knowledge or, if impossible, by resorting to probability theory as the only legitimate
mathematical tool to deal with the lack of certainty. The new paradigm, on the
contrary, is not only tolerant of uncertainty, but views uncertainty as an important
resource in pursuing knowledge. While uncertainty is not desirable for its own sake,
its role to counterbalance complexity is crucial when complexity is unmanageable or
when dealing with it is prohibitively expensive [Zadeh, 1973]. When the solution to
a problem is not required to be uncertainty-free, the computational complexity
involved may often be substantially reduced [Traub et al., 1983].

In order to utilize uncertainty as a strategic resource, we need to understand
it as broadly as possible. It turns out that probability theory does not provide a
sufficiently broad framework for this purpose [Klir, 1989a]. In recognition of the
limitations of probability theory, two generalizations in mathematics have emerged.
One of them is the generalization of classical set theory into fuzzy set theory, which
allows us to deal with sets that do not have sharp boundaries [Zadeh, 1965; Klir and
Folger, 1988]. The second is the generalization of classical measure theory into
fuzzy measure theory, which allows us to deal with measures that are not additive
[Sugeno, 1977; Wang and Klir, 1992]. These two theories can be combined.

Fuzzy set theory and fuzzy measure theory, as well as their combination,
provide us with a very broad mathematical framework for investigating uncertainty,
within which various special theories of uncertainty can be formulated. Two of
these special uncertainty theories are of interest in this paper: classical
probability theory and possibility theory. We assume [...].

As we argue elsewhere [Klir and Parviz, 1992a], probability theory and
possibility theory are complementary, but not comparable. They are capable of
describing different types of uncertainty. It is often desirable to transform
uncertainty described in one of the theories to the complementary description in the
other [Dubois and Prade, 1986; Bharathi-Devi and Sarma, 1985; Leung, 1982;
Moral, 1986; Klir, 1991]. Several distinct transformations have been proposed in
the literature for this purpose. Our aim in this paper is to compare these
transformations in terms of the closeness of preserving second-order properties. The
comparison is based on experimental results obtained by computer simulation.

The paper is a continuation of a previous study [Klir and Parviz, 1992b].
While the previous study focuses on only one second-order property (the joint
distribution calculated from two noninteractive marginal distributions), this paper
also covers projections of given joint distributions. Furthermore, it describes results
based on a slightly different measure of uncertainty than the one employed in the
previous study. The new measure of uncertainty emerged recently as a better
justified alternative [Klir and Parviz, 1992c].

2. Probability-Possibility Transformations Investigated 

In order to describe the probability-possibility transformations that are the
subject of our experimental study, let

$$ p = (p_1, p_2, \ldots, p_n), $$

$$ r = (r_1, r_2, \ldots, r_n) $$

denote, respectively, an ordered probability distribution and an
ordered possibility distribution. We assume that $p_i \ge p_{i+1}$ and $r_i \ge r_{i+1}$ for all
$i = 1, 2, \ldots, n-1$. Due to the normalization requirements of the two theories,
$p_1 + p_2 + \cdots + p_n = 1$ and $r_1 = 1$.

The first type of probability-possibility transformations $p \leftrightarrow r$ that are covered
by our study are transformations based on ratio scales. They are expressed by the
equations

$$ r_i = \frac{p_i}{p_1}, \qquad (1) $$

$$ p_i = \frac{r_i}{r_1 + r_2 + \cdots + r_n}. \qquad (2) $$
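
The ratio-scale pair of Eqs. (1) and (2), that is, r_i = p_i / p_1 and p_i = r_i / (r_1 + ... + r_n), can be sketched in a few lines. The function names and the example distribution are illustrative assumptions; the input is assumed to be ordered so that its first component is the largest.

```python
def prob_to_poss_ratio(p):
    """Eq. (1): divide every probability by the largest one, p[0]."""
    return [pi / p[0] for pi in p]


def poss_to_prob_ratio(r):
    """Eq. (2): normalise the possibilities so that they sum to one."""
    total = sum(r)
    return [ri / total for ri in r]


# Hypothetical ordered distribution; the two maps invert each other.
p = [0.5, 0.3, 0.2]
r = prob_to_poss_ratio(p)
p_back = poss_to_prob_ratio(r)
```

Because both directions only rescale the distribution, converting to possibilities and back recovers the original probabilities.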

The second type of transformations $p \leftrightarrow r$, which are often cited in the
literature, were proposed by Dubois and Prade [1982, 1983, 1986]. They are defined
by the equations

$$ r_i = \sum_{j=1}^{n} \min(p_i, p_j), \qquad (3) $$

$$ p_i = \sum_{j=i}^{n} \frac{r_j - r_{j+1}}{j}, \qquad (4) $$

where $r_{n+1} = 0$ by convention.
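
A minimal sketch of Eqs. (3) and (4), r_i = sum over j of min(p_i, p_j) and p_i = sum over j from i to n of (r_j - r_{j+1}) / j with r_{n+1} = 0, is given below. The function names and the example distribution are assumptions; lists are indexed from zero, so the divisor j of Eq. (4) appears as `j + 1`.

```python
def prob_to_poss_dp(p):
    """Eq. (3): r_i is the sum over j of min(p_i, p_j)."""
    return [sum(min(pi, pj) for pj in p) for pi in p]


def poss_to_prob_dp(r):
    """Eq. (4): p_i sums (r_j - r_{j+1}) / j for j = i..n, with r_{n+1} = 0."""
    n = len(r)
    ext = list(r) + [0.0]                # append r_{n+1} = 0
    return [sum((ext[j] - ext[j + 1]) / (j + 1) for j in range(i, n))
            for i in range(n)]


# Hypothetical ordered distribution; Eq. (4) undoes Eq. (3).
p = [0.5, 0.3, 0.2]
r = prob_to_poss_dp(p)
p_back = poss_to_prob_dp(r)
```

For this example the possibility distribution is (1, 0.8, 0.6), and applying Eq. (4) to it returns the original probabilities.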

The third type of transformations $p \leftrightarrow r$, which are asymmetric, were
proposed by Dubois, Prade, and Sandri [1991]. In one direction, $p \to r$, they are
defined by the equation

$$ r_i = \sum_{j=i}^{n} p_j. \qquad (5) $$

In the other direction, $r \to p$, they are defined by Eq. (4).
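
The asymmetry of this third pair can be seen in a short sketch that applies Eq. (5), r_i = p_i + p_{i+1} + ... + p_n, and then returns through Eq. (4). The function names and the example are illustrative assumptions.

```python
def prob_to_poss_as(p):
    """Eq. (5): r_i is the tail sum p_i + p_{i+1} + ... + p_n."""
    tails = []
    running = 0.0
    for pi in reversed(p):
        running += pi
        tails.append(running)
    return tails[::-1]


def poss_to_prob_eq4(r):
    """Eq. (4): p_i sums (r_j - r_{j+1}) / j for j = i..n, with r_{n+1} = 0."""
    n = len(r)
    ext = list(r) + [0.0]
    return [sum((ext[j] - ext[j + 1]) / (j + 1) for j in range(i, n))
            for i in range(n)]


# Hypothetical ordered distribution: the round trip does not return p.
p = [0.5, 0.3, 0.2]
r = prob_to_poss_as(p)
p_back = poss_to_prob_eq4(r)
```

The result of the round trip is still a probability distribution, but it differs from the starting one, which is the sense in which the pair is asymmetric.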

The fourth type of transformations $p \leftrightarrow r$, which were proposed by Klir [1989,
1990], are transformations that preserve uncertainty. It was shown by Geer and Klir
[1992] that unique transformations of this kind exist only under log-interval scales.
They are defined by Eqs. I and III in Figure 1. The value of $\alpha$ in these equations is
determined by solving Eq. II in Fig. 1, which expresses the requirement that the
amount of uncertainty be preserved when $p$ is transformed to $r$ or vice versa.
Functions H, N, S in Eq. II are defined by the following formulas [Klir and Parviz,
1992c; Klir, 1993]:

$$ H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i, \qquad (6) $$

$$ N(r) = \sum_{i=2}^{n} (r_i - r_{i+1}) \log_2 i, \qquad (7) $$

$$ S(r) = \sum_{i=2}^{n} (r_i - r_{i+1}) \log_2 \frac{i}{\sum_{j=1}^{i} r_j}. \qquad (8) $$


Function H is the well-known Shannon entropy [Klir and Folger, 1988], and
functions N, S are referred to as nonspecificity and strife, respectively.
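
The three measures can be evaluated directly from their definitions. The sketch below assumes the usual forms H(p) = -sum of p_i log2 p_i, N(r) = sum over i from 2 to n of (r_i - r_{i+1}) log2 i, and S(r) = sum over i from 2 to n of (r_i - r_{i+1}) log2 (i / (r_1 + ... + r_i)), with r_{n+1} = 0 and r ordered in non-increasing order. The function names are illustrative.

```python
import math


def shannon_entropy(p):
    """H(p): Shannon entropy in bits; zero probabilities contribute nothing."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0.0)


def nonspecificity(r):
    """N(r) for an ordered possibility distribution, with r_{n+1} = 0."""
    ext = list(r) + [0.0]
    # ext[i - 1] is r_i and ext[i] is r_{i+1} in the 1-based notation.
    return sum((ext[i - 1] - ext[i]) * math.log2(i)
               for i in range(2, len(r) + 1))


def strife(r):
    """S(r) for an ordered possibility distribution, with r_{n+1} = 0."""
    ext = list(r) + [0.0]
    return sum((ext[i - 1] - ext[i]) * math.log2(i / sum(r[:i]))
               for i in range(2, len(r) + 1))


# Hypothetical checks on two-element distributions.
h = shannon_entropy([0.5, 0.5])
n_value = nonspecificity([1.0, 1.0])
s_value = strife([1.0, 0.5])
```

For the completely nonspecific distribution (1, 1) the nonspecificity is one bit and the strife is zero, which illustrates how the two functions capture different aspects of possibilistic uncertainty.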

Since the value of S(r) is severely restricted when compared with the value of
N(r), as shown by Geer and Klir [1991] and Klir and Parviz [1992c], the term S(r)
plays a relatively minor role in Eq. II. To study the effect of this term on results, we
performed experiments both with and without the term. Furthermore, we also
performed experiments in which function S in Eq. II is replaced with function D defined by
the formula

$$ D(r) = -\sum_{i=1}^{n-1} (r_i - r_{i+1}) \log_2 \left[ 1 - i \sum_{j=i+1}^{n} \frac{r_j}{j(j-1)} \right]. \qquad (9) $$


This function, referred to as discord, was employed prior to the discovery of the
better justified function S [Klir and Parviz, 1992c]. We have performed experiments
with both functions in order to compare their performance.

That is, our experiments involve six distinct probability-possibility
transformations. The following are convenient abbreviations of these
transformations:

• RS - ratio-scale transformations defined by Eqs. (1) and (2);

• DP - transformations proposed by Dubois and Prade [1983], which are defined by Eqs. (3) and (4);

• AS - asymmetric transformations defined by Eqs. (4) and (5) [Dubois, Prade, and Sandri, 1991];

• NS - transformations that preserve uncertainty, which are defined by Eqs. I - III in Fig. 1;

• N - same as NS except that S(r) is excluded from Eq. II;

• ND - same as NS except that S in Eq. II is replaced with function D defined by Eq. (9).

3. Description of Experiments 

Two classes of simulation experiments regarding the six types of probability- 
possibility transformations are reported in this paper. The purpose of experiments 
of the first class is to compare the transformations by the estimated average degree 
to which they preserve joint distributions constructed from noninteractive marginal
distributions. The estimates are obtained by experiments of two types. 

In each experiment of the first type, marginal probability distributions $p_1$ and
$p_2$ are chosen for some $n \ge 2$. Assuming that $p_1$ and $p_2$ are noninteractive, we
calculate the joint probability distribution $p$ by taking the pair-wise product of their
components. Next, we convert $p_1$, $p_2$ into the corresponding marginal possibility
distributions $r_1$, $r_2$ by each of the transformation methods and combine each pair of
distributions by taking the pair-wise minimum of their components. This results in
one joint possibility distribution for each transformation method, which we convert
(using the same method) to the corresponding probability distribution $p'$. Now, we
determine the closeness of $p'$ to $p$ in terms of three criteria: Hamming distance,
Euclidean distance, and the maximum error.
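
One experiment of the first type might be organised as in the following sketch. It uses the ratio-scale transformation only, because that pair does not require the joint distribution to be re-sorted; the other methods would be substituted in the same place. The function names are hypothetical, and the Hamming distance is taken here to be the sum of absolute component differences, which is an assumption since the text does not spell out the definition.

```python
import math


def to_poss(p):
    """Ratio-scale p -> r: divide by the largest probability."""
    top = max(p)
    return [x / top for x in p]


def to_prob(r):
    """Ratio-scale r -> p: normalise so that the values sum to one."""
    total = sum(r)
    return [x / total for x in r]


def first_type_experiment(p1, p2):
    """Compare the product joint p with the joint p' obtained via possibilities."""
    joint_p = [a * b for a in p1 for b in p2]         # noninteractive: products
    r1, r2 = to_poss(p1), to_poss(p2)
    joint_r = [min(a, b) for a in r1 for b in r2]     # noninteractive: minima
    p_prime = to_prob(joint_r)
    diffs = [abs(a - b) for a, b in zip(joint_p, p_prime)]
    return {
        "hamming": sum(diffs),
        "euclidean": math.sqrt(sum(d * d for d in diffs)),
        "max_error": max(diffs),
    }


# Hypothetical marginal distributions with n = 2.
result = first_type_experiment([0.6, 0.4], [0.7, 0.3])
```

Averaging these three values over many randomly selected marginal distributions would give estimates of the kind reported in the tables.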

Experiments of the second type are similar, but they begin with given
marginal possibility distributions $r_1$ and $r_2$, for which the joint possibility distribution
$r$ is calculated by using the minimum operator. The given distributions are also
transformed into the corresponding marginal probability distributions $p_1$ and $p_2$ by
each of the methods. The joint probability distribution $p$ is then calculated for each pair
$p_1$, $p_2$ and transformed (by the same method) to the corresponding possibility
distribution $r'$. Finally, $r$ and $r'$ are compared in terms of the same criteria as in
experiments of the first type.

The purpose of experiments of the second class is to compare the six 
transformations by the estimated average degree to which they preserve marginal 
distributions calculated from given joint distributions. Two types of experiments are 
again distinguished, depending on whether the inputs are probability distributions or 
possibility distributions. The results are compared in terms of the same criteria as in 
the experiments of the first class. 
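
The text does not detail the procedure for experiments of the second class, so the following is only one plausible reading for the probabilistic input case. It assumes that marginal probabilities are obtained by summation, that the joint distribution is converted to a joint possibility distribution, that possibilistic marginals are obtained by the maximum operator, and that these are converted back for comparison. The ratio-scale transformation, the function names and the example matrix are all assumptions of this sketch.

```python
import math


def to_prob(r):
    """Ratio-scale r -> p: normalise so that the values sum to one."""
    total = sum(r)
    return [x / total for x in r]


def distances(u, v):
    """Hamming (sum of absolute differences), Euclidean and maximum error."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    return sum(diffs), math.sqrt(sum(d * d for d in diffs)), max(diffs)


def second_class_experiment(joint_p):
    """Compare marginals of joint_p with marginals recovered via possibilities."""
    rows_p = [sum(row) for row in joint_p]               # probabilistic projection
    cols_p = [sum(col) for col in zip(*joint_p)]
    top = max(max(row) for row in joint_p)
    joint_r = [[x / top for x in row] for row in joint_p]  # ratio-scale p -> r
    rows_r = [max(row) for row in joint_r]               # possibilistic projection
    cols_r = [max(col) for col in zip(*joint_r)]
    return {
        "rows": distances(rows_p, to_prob(rows_r)),
        "cols": distances(cols_p, to_prob(cols_r)),
    }


# Hypothetical 2-by-2 joint probability distribution.
result = second_class_experiment([[0.4, 0.1], [0.2, 0.3]])
```

The smaller the returned distances, the better the transformation preserves the marginal distributions of the given joint distribution.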

4. Experimental Results 

Experiments of the two classes and both types were performed for selected
values of n, from n = 2 to n = 25. In each category and for each particular value of
n, the performance of transformations RS, DP, and AS was compared with the three
variants of uncertainty-preserving transformations, ND, NS, and N, in terms of the
Hamming distance, the Euclidean distance, and the maximum error.

Results of experiments of the first class that are based on ND are published
in a previous paper [Klir and Parviz, 1992b]. They demonstrate, with considerable
consistency, that the uncertainty-preserving transformation performs best according
to each of the three indicators and that its relative performance increases with
increasing n. The results also show that AS is substantially outperformed by all the
other transformations.

After these initial results, we extended the experiments of the first class to NS 
and N. We discovered that NS also performs better than RS, DP and AS, but it is 
slightly outperformed by ND. However, the difference between the two 
performances decreases with increasing n. 

The behavior of N, which is illustrated by the selected data in Table I, is
more interesting. While N is comparable with or even slightly weaker than RS and
DP for small values of n (approximately n ≤ 5), it outperforms all the other
transformations (including NS and ND) for large values of n (approximately n ≥
10). For both types of experiments, the table is divided into three parts that contain
values of the Hamming distance, the Euclidean distance, and the maximum error (in
this order). Each column in the table represents one of the four conversion
methods as applied in experiments of either the first type or the second type. All
entries in the table are average values based on 100 experiments for randomly
selected marginal distributions.

Results of experiments of the second class for n = 5, 10, 15, 20, 25 are given
in Table II (first type) and Table III (second type). As in Table I, the three parts in
either table contain values of the Hamming distance, the Euclidean distance, and the
maximum error. Covered are all the six transformations introduced in Sec. 2.

We can see from Table III that each of the uncertainty-preserving
transformations heavily outperforms transformations RS, DP, and AS in experiments
of the second type. The order of the transformations by their performance is
consistently N, NS, ND, DP, RS, AS according to each of the three indicators. The
strong performance of transformation N is rather surprising.

According to experiments of the first type (Table II), the transformations are 
less discriminated by their performance, but NS (and ND, to a lesser degree) 
consistently outperforms the other transformations. The performances of N, RS, and 
DP are comparable and consistently higher than the performance of AS. 

5. Conclusions 

From all experimental results obtained in this study, which are
exemplified by five selected values of n in Tables I - III and in our previous paper
[Klir and Parviz, 1992b], we may conclude that the uncertainty-preserving
transformations are superior in terms of the degree to which they preserve the two
second-order properties investigated. This is not surprising since the uncertainty-preserving
transformations neither lose information nor add extraneous information
by the transformation process itself. It is reasonable to expect that the same
conclusion will be obtained for other second-order properties, such as conditioning
or joining of overlapping distributions. We intend to validate this conjecture by
additional experiments.

Although we consider three variants of uncertainty-preserving
transformations, NS, ND and N, the differences among their performances are not
large. This is a result of the fact that functions S and D are bounded from above by
the same value, which is rather small [Geer and Klir, 1991]. On the whole,
transformation N appears to be the best choice, not only due to its high performance
in most cases, but also due to its simplicity.

Functions S and D (and the associated functions NS and ND) are still
somewhat controversial as measures of possibilistic uncertainty, while function N
alone does not represent the whole uncertainty [Klir and Parviz, 1992c; Klir, 1993].
If this controversy is resolved by determining a fully justified measure of total
uncertainty, the resulting uncertainty-preserving transformation
will almost certainly outperform all three currently considered uncertainty-preserving
transformations.

Transformations NS, ND and N are based on log-interval scales and, as a 
consequence, they are unique. Uncertainty-preserving transformations may also be 
based on ordinal scales. Such transformations, which are not unique, may give us a 
greater flexibility in achieving desirable results, such as preserving best certain 
second order properties of the given distributions, maximizing the degree of 
probability-possibility consistency, and the like. A formulation of ordinal-scale 
transformations that preserve uncertainty and discussions of several other issues 
regarding probability-possibility transformations are included in another paper [Klir 
and Parviz, 1992a]. 

Acknowledgment 

This work was supported in part by the National Science Foundation under
Grant No. IRI-9015675.





REFERENCES 


Bharathi-Devi, B. and V.V.S. Sarma [1985], "Estimation of fuzzy memberships from
histograms." Information Sciences, 35(1), pp. 35-59.

Delgado, M. and S. Moral [1987], "On the concept of possibility-probability
consistency." Fuzzy Sets and Systems, 21(3), pp. 311-318.

Dubois, D. and H. Prade [1982], "On several representations of an uncertain body of
evidence." In: Fuzzy Information and Decision Processes, ed. by M.M. Gupta
and E. Sanchez, North-Holland, New York, pp. 167-181.

Dubois, D. and H. Prade [1983], "Unfair coins and necessity measures: towards a
possibilistic interpretation of histograms." Fuzzy Sets and Systems, 10(1), pp.
15-20.

Dubois, D. and H. Prade [1986], "Fuzzy sets and statistical data." European J. of
Operations Research, 25, pp. 345-356.

Dubois, D., H. Prade, and S. Sandri [1991], "On possibility/probability
transformations." Proc. Fourth IFSA Congress: Mathematics, Brussels, pp. 50-53.

Geer, J.F. and G.J. Klir [1991], "Discord in possibility theory." Intern. J. of General
Systems, 19(2), pp. 119-132.

Geer, J.F. and G.J. Klir [1992], "A mathematical analysis of information-preserving
transformations between probabilistic and possibilistic formulations of
uncertainty." Intern. J. of General Systems, 20(2), pp. 143-176.

Klir, G.J. [1989a], "Is there more to uncertainty than some probability theorists
might have us believe?" Intern. J. of General Systems, 15(4), pp. 347-378.

Klir, G.J. [1989b], "Probability-possibility conversion." Proc. 3rd IFSA Congress,
Seattle, pp. 408-411.

Klir, G.J. [1990], "A principle of uncertainty and information invariance." Intern. J.
of General Systems, 17(2-3), pp. 249-275.

Klir, G.J. [1991], "Some applications of the principle of uncertainty invariance."
Proc. Intern. Fuzzy Eng. Symp., Yokohama, Japan.

Klir, G.J. [1993], "Developments in uncertainty-based information." In: Advances in
Computers, Vol. 35, ed. by M.C. Yovits, Academic Press, San Diego.

Klir, G.J. and T.A. Folger [1988], Fuzzy Sets, Uncertainty, and Information. Prentice
Hall, Englewood Cliffs, NJ.

Klir, G.J. and B. Parviz [1992a], "Probability-possibility transformations: A
comparison." Intern. J. of General Systems, 21(4), pp.

Klir, G.J. and B. Parviz [1992b], "Possibility-probability conversions: an empirical
study." In: Progress in Cybernetics and Systems, ed. by R. Trappl, Hemisphere,
New York.

Klir, G.J. and B. Parviz [1992c], "A note on the measure of discord." Proc. Eighth
Conf. on Uncertainty in Artificial Intelligence, Stanford.

Leung, Y. [1982], "Maximum entropy estimation with inexact information." In: Fuzzy
Set and Possibility Theory, ed. by R.R. Yager, Pergamon Press, Oxford, pp. 32-37.

Moral, S. [1986], "Construction of a probability distribution from a fuzzy
information." In: Fuzzy Sets Theory and Applications, ed. by A. Jones, A.
Kaufmann, and H.J. Zimmermann, D. Reidel, Dordrecht, pp. 51-60.




Smithson, M. [1989], Ignorance and Uncertainty. [publisher illegible in source], 
New York, pp. 89-102. 

Traub, J.F., G.W. Wasilkowski, and H. Wozniakowski [1983], Information, 
[remainder of title illegible in source]. Addison-Wesley, Reading, Mass. 

[Entry illegible in source] ... Press, New York. 

[Entry illegible in source] ... 3(1), pp. 28-44. 

TABLE I. Experiments of the first class (from marginal distributions to joint 
distributions). 


Full Type 

N «S 

0,1701 M 725 

0.1776 qgUO 

01772 _ 

0,1727 CJMl 

01791 «2J33 

0,0444 0-0*37 

0X231 O0M9 

0.0151 00178 

0.0112 04)134 

0jC092 

0X007 04)178 

04WS8 04)071 

04X06 00033 

04)015 0O0» 

0.0010 00014 


DP 

0,1632 

qi894 

qi963 

01954 

04)464 

00291 

0,0207 

04)155 

0,0130 

q0296 

04)163 

ooioa 

q0066 

04)050 


Second type 
N 

1*8962 

5.1634 

134)548 

204573 


06706 

114490 

46.9769 

■MM 

106.9599 | 

03325 

04047 

05132 

05642 

13115 8 

02419 

06549 

09117 

09621 

21100 | 

01907 

14)176 

13935 

16742 

33802 ' 

01569 

127 12 

24)014 

21135 

42590 

01367 


24779 

■MM 

54)989 

04212 

04258 

03127 

02197 

01549 

01597 

02073 

02272 

01968 
024 tO 

01656 

01546 

02390 

02959 

04283 

01311 

01550 

02419 

03319 

04278 

01113 

01550 

02445 

03115 

■mm?— 





TABLE II. Experiments of the second class and first type. 
(A "?" marks a digit that is illegible in the source.) 

  n     NS       ND       N        RS       DP       AS 

  5   0.2490   0.2606   0.2473   0.2651   0.2676   0.2495 
 10   0.2006   0.2096   0.2024   0.2126   0.2152   0.2591 
 15   0.1812   0.1814   0.1849   0.1826   0.1842   0.2148 
 20   0.1518   0.1528   0.1555   0.1537   0.1551   0.1766 
 25   0.1415   0.1409   0.1452   0.1416   0.1426   0.1578 

  5   0.1317   0.1377   0.1309   0.1399   0.1412   0.1854 
 10   0.0775   0.0?0?   0.0783   0.0815   0.0825   0.1005 
 15   0.0571   0.0573   0.0583   0.0577   0.0582   0.0684 
 20   0.0418   0.0421   0.0428   0.0423   0.0427   0.0485 
 25   0.0350   0.0349   0.0359   0.0351   0.0353   0.0391 

  5   0.0945   0.0971   0.0944   0.0964   0.0993   0.1339 
 10   0.0466   0.0480   0.0472   0.0485   0.0490   0.0619 
 15   0.0300   0.0301   0.0306   0.0303   0.0306   0.0366 
 20   0.0202   0.0206   0.0207   0.0207   0.0209   0.0236 
 25   0.0158   0.0157   0.0161   0.0158   0.0159   0.0178 


TABLE III. Experiments of the second class and second type. 
(A "?" marks a digit that is illegible in the source.) 

  n     NS       ND       N        RS       DP       AS 

  5   0.4136   0.5870   0.1866   1.2693   0.7449    2.2727 
 10   0.5109   1.0152   0.2233   2.7?72   1.2854    5.1137 
 15   0.4282   1.2133   0.2005   4.0189   1.7846    7.2077 
 20   0.3843   1.3729   0.1935   5.1132   2.0965   10.4588 
 25   0.3592   1.5083   0.1941   6.1502   2.2972   13.2046 

  5   0.2481   0.3472   0.1106   0.7224   0.4711    1.2184 
 10   0.2121   0.4065   0.0926   1.0451   0.6150    1.2660 
 15   0.1407   0.3824   0.0660   1.1987   0.6485    2.2112 
 20   0.1087   0.3666   0.0551   1.2022   0.6646    2.6762 
 25   0.0895   0.3570   0.0490   1.2872   0.6799    2.9965 

  5   0.1974   0.2711   0.0879   0.5250   0.2833    0.2383 
 10   0.1394   0.2528   0.0613   0.5695   0.6221    0.9159 
 15   0.0787   0.1994   0.0373   0.5468   0.2971    0.9413 
 20   0.0559   0.1730   0.0288   0.5275   0.2799    0.9528 
 25   0.0419   0.1506   0.0?37   0.5080   0.2631    0.9622 








Figure 1. Uncertainty-invariant transformations between 
probabilities and possibilities based on log-interval 
scales. 



Inference in fuzzy rule bases with conflicting 

evidence 

László T. Kóczy * 

Department of Telecommunication and Telematics, 

Technical University of Budapest, 

Sztoczek u. 2, Budapest H-1111, Hungary 


1 Introduction 

Inference based on fuzzy 'If ... then' rules has played a very important role 
since Zadeh [13] proposed the Compositional Rule of Inference (CRI) and, 
especially, since the first successful application presented by Mamdani et al. 
[10]. From the mid 1980s, when the 'fuzzy boom' started in Japan, numerous 
industrial applications appeared, all using simplified techniques because of 
the high computational complexity. Another feature is that antecedents in 
the rules are distributed densely in the input space, so the conclusion can be 
calculated by some weighted combination of the consequents of the matching 
(fired) rules. The CRI works in the following way: if $R$ is a rule and $A^*$ is 
an observation, the conclusion is computed by $B^* = R \circ A^*$ ($\circ$ stands for the 
max-min composition). Algorithms implementing this idea directly have an 
exponential time complexity (maybe the problem is NP-hard), as the rules 
are relations in $X \times Y$, a $k_1 \times k_2$ dimensional space if $X$ is $k_1$- and $Y$ is 
$k_2$-dimensional. For a detailed analysis of the complexity see [3]. 

The simplified techniques usually decompose the relation into $k_1$ projections 
in $X_i$ and measure in some way the degree of similarity between 
observation and antecedent by some parameter of the overlapping. These 
parameters are aggregated to a single value in [0, 1], which is applied as a 
resulting weight for the given rule. The projections of rules in dimensions 

* This work was done during a visiting appointment at the Department of Computer 
Science, Pohang Institute of Science and Technology, Pohang, Kyongbuk, P.O.Box 125, 
790-330, Korea 



$Y_i$ are weighted by these aggregated values and then they are combined in 
order to obtain a resulting conclusion separately in every dimension. 

This method is inapplicable with sparse rule bases, as there is no guarantee 
that an arbitrary observation matches any of the antecedents (cf. [14]). 
Then the degree of similarity is 0 and all consequents are weighted by 0. 
Some considerations for such a situation are summarized in the next sections. 


2 The semantical interpretation of inference 

The rules we deal with in this paper have the form 

'If X is $A_i$, then Y is $B_i$' 

Such a rule is represented by the relation 

$$R_i(x, y) = \min\{A_i(x), B_i(y)\}$$ 

This interprets $R_i$ as a 'fuzzy point' in $X \times Y$, and so the whole rule 
system describes in some way a fuzzy function $y = R(x)$. For a thorough 
analysis of rule interpretations see [1]. 

An observation 'X is $A^*$' is a fuzzy value of X and is transformed to 
$X \times Y$ in the form of its cylindric extension 

$$O(x, y) = A^*(x).$$ 

For the rule system $R = \{R_i : i \in N_r\}$ the fuzzy conclusion in $X \times Y$ is 

$$C(x, y) = \max_i \{\min\{R_i(x, y), O(x, y)\}\}$$ 

and its projection to Y is 

$$B^*(y) = \sup_x \{\max_i \{\min\{R_i(x, y), O(x, y)\}\}\}$$ 

This algorithm of inference estimates the value $y = R(A^*(x))$ by $B^*(y)$. 

$B^*(y) \ne 0$ only when the antecedent parts $A_i$ cover the input space, i.e. 
for every $x$ there is always at least one rule $R_i$ such that $x \in \mathrm{supp}(A_i)$. In 
sparse rule bases [14], this kind of inference results in no conclusion. 
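
The max-min inference above can be sketched on a discretized universe. This is a minimal illustration only: the triangular membership functions, the grid, and the names `tri` and `infer` are assumptions made for the example and do not come from the paper. It shows that an observation falling in a gap between antecedents produces an identically zero conclusion.

```python
def tri(a, b, c):
    """Triangular membership function with support (a, c) and peak at b (assumed shape)."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


def infer(rules, obs, xs, ys):
    """B*(y) = sup_x max_i min(A_i(x), B_i(y), A*(x)), evaluated on the grids xs, ys."""
    return [
        max(min(A(x), B(y), obs(x)) for x in xs for (A, B) in rules)
        for y in ys
    ]


xs = [i / 10 for i in range(101)]  # grid on [0, 10], an assumption of this sketch
ys = xs

# A sparse rule base: the antecedents leave the region (2, 8) uncovered.
rules = [(tri(0, 1, 2), tri(0, 1, 2)), (tri(8, 9, 10), tri(8, 9, 10))]

hit = infer(rules, tri(0.5, 1.5, 2.5), xs, ys)  # observation overlaps the first antecedent
miss = infer(rules, tri(4, 5, 6), xs, ys)       # observation lies in the gap

print(max(hit), max(miss))
```

The second observation overlaps no antecedent, so every `min` term is 0 and no conclusion is obtained, which is the situation the following sections address.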

The approach of Türkşen dealing with this kind of problem [11] uses the 
similarity measure of two fuzzy sets: similarity measure = (1 + distance measure)^(-1). 
With the usual crisp distance measures of fuzzy sets, the similarity measure 
defined in this way has its range in [0, 1], but it results in 0 when the two 
fuzzy sets have disjoint supports. In the following, this idea will be extended to 
arbitrary rule bases. 





3 Gradual metric variables and fuzzy distance of 
fuzzy sets 

Variables in real control applications usually have comparable and measurable 
values. In some other examples, such as classifying tomatoes according 
to ripeness on the basis of their colours (see [4, 6]), a similar comparability 
and at least some 'pseudo-measurability' appears. 

In [2] a very interesting interpretation of the semantic contents of fuzzy 
rules is proposed: 

'If X is A then Y is B' = 'The more X is A the more Y is B' 

The idea of gradual rules in [2], in accordance with the analogical reasoning 
in [11], can be interpreted as: 

'The more similar x is to A, the more similar y is to B.' 

Gradual rules exist because the variables appearing in them are gradual. 
Graduality means mathematically that a full ordering can be defined over the 
variables. In practical cases, domain and range of the variables are finite, so 
max{X}, min{X}, etc. exist. If X and Y are compound, their components 
are bounded sets with a full ordering, so a partial ordering exists in both X 
and Y: 

$$x_1 \le x_2 \ \text{iff} \ \forall i : x_{1,i} \le x_{2,i}, \ \text{etc.}$$ 

Also the overall minima and maxima exist. 

Besides ordering, measurability can be observed: e.g., 100°C is farther 
from 12°C than from 67°C, etc. So the distance of two values can be 
expressed. In the case of many originally non-measurable variables, some 
natural mapping of the range to the interval [0, 1] provides virtual 
measurability. Variables measurable in any sense will be named metric. Even 
tomato colours or degrees of ripeness are metric in this sense, as a mapping from 
[deep green, deep red] to [0, 1] can be introduced. 

The fuzzy distance between linguistic (fuzzy) sets is defined with the help of 
the Resolution Principle, for pairs of fuzzy sets satisfying the partial ordering 
$A \prec B$. $\prec$ is introduced over $P(X_i)$, the set of all convex and normal fuzzy 
sets of $X_i$, so that for $A, B \in P(X_i)$, $A \prec B$ if 

$$\forall \alpha \in (0, 1] : \inf\{A_\alpha\} \le \inf\{B_\alpha\} \ \text{and} \ \sup\{A_\alpha\} \le \sup\{B_\alpha\}.$$ 

$\Pi_\prec$, a subset of $P^2(X)$, is the relation of all comparable pairs: 

$$\Pi_\prec = \{(A, B) \mid A, B \in P(X), A \prec B\}$$ 
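
The comparability test can be illustrated for trapezoidal fuzzy sets. This is a sketch under assumptions: the 4-tuple encoding `(a, b, c, d)` and the names `alpha_cut` and `precedes` are hypothetical, and the closure of the support is used as the cut at level 0. Because the cut endpoints of a trapezoid are linear in the level, checking levels 0 and 1 covers all levels in between.

```python
def alpha_cut(trap, alpha):
    """Closed alpha-cut [inf, sup] of a trapezoid (a, b, c, d) with a <= b <= c <= d."""
    a, b, c, d = trap
    return (a + alpha * (b - a), d - alpha * (d - c))


def precedes(A, B, levels=(0.0, 1.0)):
    """True if inf{A_alpha} <= inf{B_alpha} and sup{A_alpha} <= sup{B_alpha} on all levels."""
    return all(
        alpha_cut(A, t)[0] <= alpha_cut(B, t)[0]
        and alpha_cut(A, t)[1] <= alpha_cut(B, t)[1]
        for t in levels
    )


low = (0, 1, 2, 3)
high = (4, 5, 6, 7)
wide = (0, 1, 2, 9)  # its right slope reaches beyond the support of `high`

print(precedes(low, high), precedes(wide, high), precedes(high, wide))
```

The pair (`wide`, `high`) is not comparable in either direction, so it would not belong to the relation of comparable pairs and no fuzzy distance would be defined for it.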




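Formally, for $A$ and $B$ in $\Pi_\prec$, the lower fuzzy distance of $A$ and $B$ is 

$$d_L : \Pi_\prec \to P([0, 1]) \quad \text{and} \quad d_L(A, B) = \sum_{\alpha \in [0,1]} \alpha / D(\inf\{A_\alpha\}, \inf\{B_\alpha\}), \quad \delta \in [0, 1] \ \text{or} \ \delta \in [0, \sqrt{k}].$$ 

The upper fuzzy distance $d_U$ is defined similarly, with sup in place of inf. 
In the above, $D$ stands for the Euclidean (or, more generally, Minkowski) 
distance of $\inf\{A_\alpha\}$ and $\inf\{B_\alpha\}$. (For more details see [9].) 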

Considering R from the point of view of the Resolution Principle, every 
rule is resolved into a family of $\alpha$-rules: 

'If X is $A_{i\alpha}$ then Y is $B_{i\alpha}$' 

The $\alpha$-cuts are represented by $k_1$- and $k_2$-dimensional hyperintervals in 
$X \times Y$. Every hyperinterval has its infimum and supremum, so if $\alpha$ is fixed, 
every rule can be unambiguously described by a pair of points in $X \times Y$, i.e. 
one for the infima ('lower point') and one for the suprema ('upper point'). 
[Start of sentence illegible in source] it is sufficient to represent every rule by 
$2\,|\bigcup (\Lambda_{A_i} \cup \Lambda_{B_i})|$ points. In such a way every rule base consisting of r rules 
is represented in $X \times Y$ for given $\alpha$ and L or U by exactly r points. 

4 Linear interpolation of rules 

Extended gradual rules can be interpreted by the simple linear ratio: 

$$\mathrm{dist}(A^*, A_1) : \mathrm{dist}(A^*, A_2) = \mathrm{dist}(B^*, B_1) : \mathrm{dist}(B^*, B_2)$$ 

if $A_1 \prec A^* \prec A_2$ and $B_1 \prec B_2$. 

Interpreting dist as the fuzzy distance, the fundamental equation of linear 
rule interpolation is introduced: 

$$d_\alpha(A_1, A^*) : d_\alpha(A^*, A_2) = d_\alpha(B_1, B^*) : d_\alpha(B^*, B_2)$$ 

where $\alpha \in \Lambda_{A_1} \cup \Lambda_{A_2} \cup \Lambda_{B_1} \cup \Lambda_{B_2}$. 

Applying the definition of $d_\alpha$ for L and U separately, altogether $2|\Lambda|$ 
equations are obtained. These can be solved: 

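$$\inf\{B^*_\alpha\} = \frac{\dfrac{\inf\{B_{1\alpha}\}}{d_{\alpha L}(A^*_\alpha, A_{1\alpha})} + \dfrac{\inf\{B_{2\alpha}\}}{d_{\alpha L}(A^*_\alpha, A_{2\alpha})}}{\dfrac{1}{d_{\alpha L}(A^*_\alpha, A_{1\alpha})} + \dfrac{1}{d_{\alpha L}(A^*_\alpha, A_{2\alpha})}}$$ 

$$\sup\{B^*_\alpha\} = \frac{\dfrac{\sup\{B_{1\alpha}\}}{d_{\alpha U}(A^*_\alpha, A_{1\alpha})} + \dfrac{\sup\{B_{2\alpha}\}}{d_{\alpha U}(A^*_\alpha, A_{2\alpha})}}{\dfrac{1}{d_{\alpha U}(A^*_\alpha, A_{1\alpha})} + \dfrac{1}{d_{\alpha U}(A^*_\alpha, A_{2\alpha})}}$$ 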

So the $\alpha$-level set of the conclusion is given for every $\alpha$: 

$$B^*_\alpha = [\inf\{B^*_\alpha\}, \sup\{B^*_\alpha\}]$$ 

and so the fuzzy set $B^*$ can be constructed. Fig. 1 depicts a simple example 
of interpolating the conclusion belonging to a non-overlapping observation. 
In Fig. 2.a the $\alpha$-distance (lower) for two comparable fuzzy sets is indicated; 
Fig. 2.b shows the fuzzy distance sets. 

It is possible to extend this idea to the interpolation of 2k rules, and further 
to various modified techniques of rule inter- and extrapolation. For more 
details, see [4, 5, 7, 8]. 
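
The fundamental equation of linear rule interpolation can be sketched for the one-dimensional case. The sketch assumes trapezoidal sets encoded as hypothetical 4-tuples `(a, b, c, d)`, uses the absolute difference as the crisp distance, and treats only the levels 0 and 1; the function names are illustrative and not taken from the paper.

```python
def alpha_cut(trap, alpha):
    """Closed alpha-cut [inf, sup] of a trapezoid (a, b, c, d)."""
    a, b, c, d = trap
    return (a + alpha * (b - a), d - alpha * (d - c))


def interpolate_endpoint(a1, a2, a_obs, b1, b2):
    """Solve d(a1, a*) : d(a*, a2) = d(b1, b*) : d(b*, b2) for b*."""
    d1 = abs(a_obs - a1)
    d2 = abs(a2 - a_obs)
    if d1 == 0:
        return b1  # observation endpoint coincides with the first rule
    if d2 == 0:
        return b2  # observation endpoint coincides with the second rule
    return (b1 / d1 + b2 / d2) / (1 / d1 + 1 / d2)


def interpolate(A1, A2, A_obs, B1, B2, levels=(0.0, 1.0)):
    """Return {alpha: (inf, sup)} of the interpolated conclusion B*."""
    result = {}
    for alpha in levels:
        cuts = [alpha_cut(t, alpha) for t in (A1, A2, A_obs, B1, B2)]
        lower = interpolate_endpoint(*(c[0] for c in cuts))
        upper = interpolate_endpoint(*(c[1] for c in cuts))
        result[alpha] = (lower, upper)
    return result


# Two rules with disjoint antecedents and an observation in the gap between them.
conclusion = interpolate(
    A1=(0, 1, 2, 3), A2=(10, 11, 12, 13), A_obs=(4, 5, 6, 7),
    B1=(0, 2, 4, 6), B2=(20, 22, 24, 26),
)
print(conclusion)
```

In this example the observation sits four tenths of the way from the first antecedent to the second, and each endpoint of the conclusion lands four tenths of the way between the corresponding consequent endpoints.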


5 Approximation of the conclusion by regression 

A very difficult question is what happens if the rule base contains some 
internal conflicts. An extreme example of this is when, for any $\alpha$ and L or U, there 
are two different rules in the base for which $\min/\max A_{i_1\alpha} = \min/\max A_{i_2\alpha}$ 
but $\min/\max B_{i_1\alpha} \ne \min/\max B_{i_2\alpha}$. Then any 'interpolation' results in 
a 'perpendicular' line in 1 + 1 dimensions and no defined extrapolation 
outside the two rules. Also, in the case of simply applying the interpolation 
technique to two flanking rules, it is not clear which of the two must be 
taken into consideration. 

Although there is no formal strict contradiction, we still face conflicting 
evidence where the hypothetical approximation curve (e.g. polynomial 
interpolation) has too large an 'amplitude' and the interpolated parts are 
very far from the area in $X \times Y$ where the actual rules are located. Fig. 
3.a presents a case with 6 rules where the approximation curve (using the 
extension of the above interpolation technique) fits the rules very well. In 
Fig. 3.b, however, the curve is rather different from the obvious behaviour of the 
rules: it goes outside of the 'rule area' and is rather far from the expected 
$R(x)$. In such cases, instead of eliminating conflicting rules, the situation 
should be accepted as it is, and the solution should be looked for in the 
form of some compromise and simultaneous consideration of the conflicting 
rules. 





How can conflicting rules be taken into account simultaneously? A possible 
technique is based on linear regression (see e.g. [12]). As the rule system is 
represented by a set of points in $X \times Y$, it is reasonable to compute the best 
fitting straight line by the least square method. In 1 + 1 dimensions this is 
defined by 

$$y = ax + b = \frac{\sum x_i y_i - \sum x_i \sum y_i / r}{\sum x_i^2 - \left(\sum x_i\right)^2 / r}\, x + \left(\sum y_i / r - a \sum x_i / r\right)$$ 
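
A minimal sketch of the 1 + 1 dimensional least square line, written term by term from the formula above. The function name `fit_line` and the sample points are illustrative assumptions; the points stand for the lower (or upper) points of rules at one fixed level, including two conflicting rules that share the same antecedent point.

```python
def fit_line(points):
    """Least square line y = a*x + b through the points (x_i, y_i)."""
    r = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = sxx - sx * sx / r
    if denom == 0:
        raise ValueError("all x values coincide; the slope is undefined")
    a = (sxy - sx * sy / r) / denom
    b = sy / r - a * sx / r
    return a, b


# The rules at x = 1 conflict (y = 1 and y = 3); regression compromises between them.
rule_points = [(0, 0), (1, 1), (1, 3), (2, 2)]
a, b = fit_line(rule_points)
print(a, b)
```

At the conflicting antecedent x = 1 the fitted line passes between the two consequent values instead of selecting one of the rules.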


It is much more complicated to treat compound variables. If X has $k_1$ 
and Y has $k_2$ components, the least square regression results in a $k_1 \times k_2$ 
dimensional hyperplane. The problem can always be decomposed into $k_2$ 
problems of dimension $k_1 + 1$, where $Y_i$ is approximated by $a_i^T x + b_i$. 
So it is sufficient to examine the case with compound X but simple Y. 

The solution of this problem is given by 

$$a = [a_j] \quad \text{and} \quad b = \sum_i y_i / r - a^T \Big[\sum_i x_{ij} / r\Big], \quad \text{where}$$ 

$$a = \Big( \Big[x_{ij} - \sum_i x_{ij} / r\Big]^T \Big[x_{ij} - \sum_i x_{ij} / r\Big] \Big)^{-1} \Big[x_{ij} - \sum_i x_{ij} / r\Big]^T \Big[y_i - \sum_i y_i / r\Big]$$ 

$i = 1 \ldots r$, $j = 1 \ldots k_1$; [ ] indicates a matrix and T the transpose. 

It is clear that this regression line or hyperplane gives a very rough 
approximation of the rule base unless it has a really linear tendency. 
(See e.g. the rule base in Fig. 4.) So it is more reasonable to calculate 
$y = ax + b$ only for a given environment of the observation: a 'window' 
around the respective value of $A^*$. Then we obtain the best fitting straight 
line only for a restricted area. If the window is not too large, this leads to 
a rather good partially linear approximation. (See Fig. 5 for the same rule 
base.) 
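
The window idea can be sketched as follows. The crisp window is taken here as the interval of half-width `w` around the observed value `x0`; this parametrization, the function names, and the V-shaped sample rule base are assumptions made for the illustration.

```python
def fit_line(points):
    """Least square line y = a*x + b through the points (x_i, y_i)."""
    r = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    denom = sxx - sx * sx / r
    if denom == 0:
        raise ValueError("need at least two distinct x values")
    a = (sxy - sx * sy / r) / denom
    return a, sy / r - a * sx / r


def window_fit(points, x0, w):
    """Fit the line using only the rule points with |x - x0| <= w."""
    inside = [(x, y) for x, y in points if abs(x - x0) <= w]
    if not inside:
        raise ValueError("no rule point falls inside the window")
    return fit_line(inside)


# A V-shaped rule base: a single global line cannot follow it.
rule_points = [(0, 4), (1, 3), (2, 2), (3, 3), (4, 4)]

global_line = fit_line(rule_points)
left_line = window_fit(rule_points, x0=0.5, w=1.5)   # uses x = 0, 1, 2
right_line = window_fit(rule_points, x0=3.5, w=1.5)  # uses x = 2, 3, 4
print(global_line, left_line, right_line)
```

The global line is flat and misses the shape of the rule base, while each windowed line follows the local tendency around its observation.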

Let us compare the window-regression technique with the previous inter- 
polation/extrapolation method. While in the latter it is sufficient to calcu- 
late the approximation curve (maybe partially linear) once before starting 
the inference/control algorithm based on the rules, it seems that because of 
shifting the window it is necessary to calculate a new equation for y every 
time when we have a new observation. This is painful, especially in the com- 
pound case as the matrix inversion takes a very long runtime. If this is true, 
the computational complexity of the newly proposed method is not compet- 
itive with other techniques. Luckily enough, in the case of simple variables 





and a rule base with r rules it is sufficient to calculate at most $2r$ different 
regression lines for any fixed set of points, and even in the compound variable 
case the space $X \times Y_i$ can be divided into at most $2r^{k_1}$ areas where 
the regression hyperplane would be different (for a given $\alpha$ and lower/upper). The proof of 
this statement is not very difficult. Fig. 6 depicts a simple example of how to 
divide X for the rule base of the previous figures. A consequence is that 
when using trapezoidal rules and $k_1 + k_2$ variables, altogether $8 k_2 r^{k_1}$ regression 
hyperplanes must be calculated before starting real-time control. So it is 
guaranteed that the computational time during the actual control is not higher 
than in the case of straightforward approximation. 

A significant disadvantage of this technique is that the function obtained is 
not continuous: it is a broken line or broken plane, and so the approximated 
conclusion might change abruptly when the observation is only slightly different. 
(See Fig. 7 for an illustration.) A solution to this problem is 
the fuzzy window technique, i.e. the above 
method is modified so that the environment of every observation has fuzzy 
rather than crisp boundaries. So the abrupt appearance of a new rule in 
the window when the observation is moved slightly is eliminated completely: 
every rule appearing in the window is weighted by the membership value 
attached to the location of that rule, depending on the location of the 
observation; this weight is, however, very small if the window is defined 
by a membership function smooth enough. For this purpose, a trapezoidal 
window is rather suitable. (See Fig. 8. The areas with $\mu = 0$ and 1 are 
indicated; in between, $0 < \mu < 1$.) 

It is a new problem now, how the least square method works with 
weighted points. Clearly, the gain in the smoothness and continuity of the 
approximation function costs considerable computational time. Because of 
the introduction of the fuzzy (continuous membership function) window, no 
equivalent or extension of the above statement concerning the finiteness of 
the number of possible regression lines exists. The regression line calculated 
in terms of the observation is continuously changing. An exact examination 
of the computational complexity will follow. 

Instead of examining just the case of the fuzzy window regression we 
present the solution of the general fuzzy regression, where points can be 
weighted by arbitrary membership degrees. 

Suppose that we have points $(x_i, y_i)$ ($i = 1 \ldots r$) and each has the 
membership degree $\mu_i$. The straight line with the least square sum of differences is 
then 


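$$y = ax + b, \quad a = \frac{\sum \mu_i (x_i - \bar{x})(y_i - \bar{y})}{\sum \mu_i (x_i - \bar{x})^2}, \quad b = \bar{y} - a \bar{x}, \quad \bar{x} = \frac{\sum \mu_i x_i}{\sum \mu_i}, \quad \bar{y} = \frac{\sum \mu_i y_i}{\sum \mu_i} \tag{1}$$ 
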


The proof of this statement is by partial differentiation of the residual sum of 
squares with respect to a and b. 
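
A sketch of regression with membership-weighted points, obtained by minimizing the weighted residual sum of squares. The trapezoidal window parametrization (`core` and `slope` half-widths), the function names, and the sample points are assumptions of this illustration and are not taken from the paper.

```python
def weighted_fit(points, weights):
    """Line y = a*x + b minimizing sum(mu_i * (y_i - a*x_i - b)**2)."""
    sw = sum(weights)
    if sw == 0:
        raise ValueError("all weights are zero")
    mx = sum(m * x for m, (x, _) in zip(weights, points)) / sw
    my = sum(m * y for m, (_, y) in zip(weights, points)) / sw
    num = sum(m * (x - mx) * (y - my) for m, (x, y) in zip(weights, points))
    den = sum(m * (x - mx) ** 2 for m, (x, _) in zip(weights, points))
    if den == 0:
        raise ValueError("weighted x values coincide; the slope is undefined")
    a = num / den
    return a, my - a * mx


def trapezoid_window(x0, core, slope):
    """Window membership: 1 within `core` of x0, falling linearly to 0 over `slope`."""
    def mu(x):
        d = abs(x - x0)
        if d <= core:
            return 1.0
        return max(0.0, 1.0 - (d - core) / slope)
    return mu


rule_points = [(0, 0), (1, 1), (2, 2), (3, 10)]

# Weight 0 removes the last rule completely; a small weight lets it in gradually.
excluded = weighted_fit(rule_points, [1, 1, 1, 0])
barely_in = weighted_fit(rule_points, [1, 1, 1, 0.01])

window = trapezoid_window(x0=1.0, core=1.0, slope=1.5)
fuzzy = weighted_fit(rule_points, [window(x) for x, _ in rule_points])
print(excluded, barely_in, fuzzy)
```

As the weight of the outlying rule grows from 0, the slope moves away from 1 gradually rather than jumping, which is the smoothing effect the fuzzy window is meant to provide.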

Using the above, it is possible to approximate the $\alpha$ L or U points of the 
conclusion in a highly flexible way: even flexible windows can be applied, 
as a matter of course with the computational time following from the 
above equations. It is not difficult to extend the above result to compound 
variable cases. Instead of giving the rather complicated equation we just 
indicate that the mean values $\sum_i x_{ij} / r$ and $\sum_i y_i / r$ must be replaced by 
$\sum_i \mu_i x_{ij} / \sum_i \mu_i$ and $\sum_i \mu_i y_i / \sum_i \mu_i$, respectively; further, in all the sums $x_i$ is 
replaced by [expression illegible in source]. It is a rather serious problem here that the computational 
complexity is high: in every step of inference the inversion of several $r \times k_1$ 
dimensional matrices is to be done, depending on the cardinality of the level 
sets at least 3 or 4 of them. 

A further direction of this research is that $\sigma^2$ be calculated inside the 
crisp window; moreover, by using $\mu_i$ for weighting, a fuzzy variance is obtained 
which can be used for measuring the degree of conflict in the evidence of the 
given rule base, on level $\alpha$ and L or U. 


References 

[1] D. Dubois and H. Prade: Basic issues on fuzzy rules and their application 
to fuzzy control. Fuzzy Control Workshop, IJCAI-91, Sydney, 1991. 13 p. 

[2] D. Dubois and H. Prade: Gradual inference rules in approximate 
reasoning. To appear in Information Sciences, 1992. 

[3] L. T. Kóczy: On the computational complexity of rule base fuzzy 
inference. NAFIPS '91, Columbia, Missouri, 1991. pp. 87-91. 

[4] L. T. Kóczy and K. Hirota: Rule interpolation in approximate reasoning 
based fuzzy control. Proc. of Fourth IFSA World Congress, Brussels, 
89-92 (1991) 

[5] L. T. Kóczy and K. Hirota: Rule interpolation by α-level sets in fuzzy 
approximate reasoning. BUSEFAL 46, 115-123 (1991) 

[6] L. T. Kóczy and A. Juhász: Fuzzy rule interpolation and the RULEINT 
program. Proc. Joint Hungarian-Japanese Symposium on Fuzzy Systems 
and Applications, Budapest, 91-94 (1991) 

[7] L. T. Kóczy, K. Hirota and A. Juhász: Interpolation of 2 and 2k rules 
in fuzzy reasoning. Proc. of IFES '92, Fuzzy Engineering toward Human 
Friendly Systems, Yokohama, 206-217 (1991) 

[8] L. T. Kóczy and K. Hirota: Reasoning by analogy with fuzzy rules. Proc. 
IEEE Int. Conference on Fuzzy Systems, San Diego, California, 263-270 
(1992) 

[9] L. T. Kóczy and K. Hirota: Interpolative reasoning with insufficient 
fuzzy rule bases. Submitted to Information Sciences (1992) 

[10] E. H. Mamdani and S. Assilian: An experiment in linguistic synthesis 
with a fuzzy logic controller. In: E. H. Mamdani and B. Gaines (eds.): 
Fuzzy reasoning and its applications. Academic Press, London, 1981. pp. 
311-323 

[11] I. B. Türkşen and Z. Zhong: An approximate analogical reasoning 
approach based on similarity measures. IEEE Transactions on Systems, 
Man and Cybernetics, 1049-1056 (1988). 

[12] S. Weisberg: Applied linear regression. Wiley, New York, 1980. 283 p. 

[13] L. A. Zadeh: Outline of a new approach to the analysis of complex 
systems and decision processes. IEEE Trans. Systems, Man and Cybernetics 
(1973). pp. 28-44. 

[14] L. A. Zadeh: Interpolative reasoning in fuzzy logic and neural network 
theory. IEEE International Conference on Fuzzy Systems, San Diego, 
California, 1992. 












GAUSSIAN MEMBERSHIP FUNCTIONS ARE MOST ADEQUATE 
IN REPRESENTING UNCERTAINTY IN MEASUREMENTS 

V. Kreinovich (1), C. Quintana (1,2), L. Reznik (3) 

(1) Computer Science Department, University of Texas at El Paso, El Paso, TX 79968, USA 
(2) Department of Electrical Engineering and Computer Science, 
University of Michigan at Ann Arbor, Ann Arbor, MI 48109-2122, USA 
(3) Department of Electrical and Electronic Engineering, Footscray Campus, 
Victoria University of Technology, MMC Melbourne, VIC 3000, Australia 

Abstract. In rare situations like fundamental physics we perform experiments without knowing 
what their results will be. In the majority of real-life measurement situations, we more or less know 
beforehand what kind of results we will get. Of course, this is not the precise knowledge of the 
type "the result will be between $a - \delta$ and $a + \delta$", because in this case, we would not need any 
measurements at all. This is usually a knowledge that is best represented in uncertain terms, like 
"perhaps (or "most likely", etc.) the measured value x is between $a - \delta$ and $a + \delta$". 

Traditional statistical methods neglect this additional knowledge and process only the 
measurement results. So it is desirable to be able to process this uncertain knowledge as well. A 
natural way to process it is by using fuzzy logic. But there is a problem: we can use different 
membership functions to represent the same uncertain statements, and different functions lead to 
different results. What membership function to choose? 

In the present paper, we show that under some reasonable assumptions, Gaussian functions 
$\mu(x) = \exp(-\beta x^2)$ are the most adequate choice of the membership functions for representing 
uncertainty in measurements. This representation was efficiently used in testing jet engines for 
airplanes and spaceships. 

1. INTRODUCTION 

Usually in measurement situations there is some prior knowledge. In rare situations like 
fundamental physics we perform experiments without knowing what their results will be. In the 
majority of real-life measurement situations, we more or less know beforehand what kind of results 
we will get. Of course, this is not the precise knowledge of the type "the result will be between 
$a - \delta$ and $a + \delta$", because in this case we would not need any measurements at all. This is usually 
a knowledge that is best represented in uncertain terms, like "perhaps (or "most likely", etc.) the 
measured value x is between $a - \delta$ and $a + \delta$". 

Traditionally the uncertain prior knowledge is not used in measurement processing. 
Traditional statistical methods neglect this additional knowledge and process only the measurement 
results. So it is desirable to be able to process this uncertain knowledge as well. 

The usage of fuzzy logic and related problems. A natural way to process uncertainty is by 
using fuzzy logic [Z65]. This way we represent every statement of the type "most likely, $|x - a| \le \delta$" 
by a membership function $\mu(x)$ that for each x gives us a degree to which we are certain that this 
particular x satisfies the given condition. But there is a problem: we can use different membership 
functions to represent the same uncertain statements. What membership function to choose? 

What are we planning to do? In the present paper, we show that under some reasonable 
assumptions, Gaussian functions $\mu(x) = \exp(-\beta x^2)$ are the most adequate choice of the membership 
functions for representing uncertainty in measurements. This representation was efficiently used in 
testing jet engines for airplanes and spaceships. 






2. MOTIVATION OF THE FOLLOWING DEFINITIONS 


We must have in mind that different experts can have different opinions. Therefore 
the final resulting knowledge about the value of a physical quantity does not consist of a single 
statement, but can be formed by adding several statements of several experts, e.g., "most likely, 
$|x - a_1| \le \delta_1$", "most likely, $|x - a_2| \le \delta_2$", ... The resulting statement is "most likely, $|x - a_1| \le \delta_1$, 
and most likely, $|x - a_2| \le \delta_2$, ..." In order to represent this resulting knowledge we must choose some 
operation $*$ for &. Then the resulting membership function will be equal to $\mu(x) = \mu_1(x) * \mu_2(x) * \ldots$, 
where $\mu_i(x)$ corresponds to the opinion of the $i$-th expert. 

What &-operation to choose? Experimental results given in [HC76], [O77], and [Z78] show 
that among all possible operations $a, b \to \min(a, b)$ and $a, b \to ab$ are the best fit for human 
reasoning. 

The min operation does not seem to be adequate for our purposes, because if we use min, then, 
e.g., the degree to which a function $x(t)$ satisfies the condition "for all t, most likely $|x(t)| \le M$" 
is equal to the minimum of the degrees of the statements "most likely, $|x(t)| \le M$" for all t. This 
minimum is attained when the value of $|x(t)|$ is the biggest possible. Therefore, the function $x_1(t)$ 
that is everywhere equal to $2M$ gets the same degree of consistency with the above-given rule as 
the function that is almost everywhere equal to 0 and attains the value $2M$ only on a small 
interval. Intuitively, however, for the first function, for which the inequality is not true at a single 
point, our degree of belief that $x_1(t)$ satisfies this condition is practically 0, while for the second 
function, for which this inequality is almost everywhere true, our degree of belief must be close to 
1. 

So, using min in our problem is inconsistent with our intuition, and therefore we must use the 
product for &. 
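
The argument against min can be checked numerically on a sampled version of the two functions. This sketch rests on assumptions: the continuous variable t is replaced by 100 samples, and the Gaussian form `mu(y) = exp(-y*y)` is used for "most likely, |y| ≤ 1" purely for illustration, anticipating the paper's main result.

```python
import math


def mu(y):
    """Assumed degree to which 'most likely, |y| <= 1' holds for the value y."""
    return math.exp(-y * y)


M = 1.0
n = 100  # number of sampled time points (an assumption of this sketch)

x1 = [2 * M] * n                    # violates the bound at every sample
x2 = [2 * M] + [0.0] * (n - 1)      # violates the bound at a single sample


def degree_min(samples):
    return min(mu(x / M) for x in samples)


def degree_product(samples):
    return math.prod(mu(x / M) for x in samples)


print(degree_min(x1), degree_min(x2))
print(degree_product(x1), degree_product(x2))
```

With min the two functions receive the same degree, while with the product the function that violates the bound everywhere receives a degree many orders of magnitude smaller than the one that violates it at a single sample.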

Comment. Other arguments for choosing different & operations are given in our previous 
publications [KR86] and [KQLFLKBR92]. 

We want to describe membership functions for the following statements. We are 
interested in describing statements of the type "most likely, $|x - a| \le \delta$", where x is unknown, and $a$, $\delta$ 
are known values. So we must describe to what extent any given value x satisfies this condition. 

All these membership functions can be obtained from one of them. Evidently, x satisfies 
the inequality $|x - a| \le \delta$ if and only if the value $y = (x - a)/\delta$ satisfies the inequality $|y| \le 1$. 
Therefore, it is natural to assume that the statement "most likely, $|x - a| \le \delta$" has the same degree 
of belief as the statement "most likely, $|y| \le 1$", where $y = (x - a)/\delta$. So, if we are able to 
describe a membership function $\mu(y)$ that corresponds to the statement "most likely, $|y| \le 1$", then 
we will be able to describe our degree of belief $\mu_1(x)$ that x satisfies the condition "most likely, 
$|x - a| \le \delta$" as $\mu((x - a)/\delta)$. So the main problem is to find an appropriate function $\mu(x)$. 

What if we ask several experts. A statement "most likely, $|x - a| \le \delta$" means that an expert 
estimates x as a, and his own estimate of his precision is $\delta$. Since such estimates are often very 
crude, it is reasonable to ask the opinion of several experts. After we have asked k experts, we get 
k statements of the same type: "most likely, $|x - a_i| \le \delta_i$", where $i = 1, 2, \ldots, k$, and $a_i$ and $\delta_i$ are 
the estimates of the i-th expert. The corresponding membership functions are $\mu((x - a_i)/\delta_i)$. 

Since all of them are experts, we believe in what all of them say, and therefore our resulting 
knowledge is: "most likely, $|x - a_1| \le \delta_1$, and most likely, $|x - a_2| \le \delta_2$, and ..." Since we 
agreed to represent "and" as a product, the resulting membership function is equal to $\nu(x) = 
\mu((x - a_1)/\delta_1)\,\mu((x - a_2)/\delta_2) \ldots \mu((x - a_k)/\delta_k)$. 





In case we have a precise knowledge, and each of the experts describes an interval, in which 
the unknown value x must be, the resulting knowledge is that x belongs to the intersection of all 
these intervals. This intersection is itself an interval, and therefore the only effect of asking several 
experts is that we decrease uncertainty. We do not change the form of the knowledge: it is still an 
interval, and in principle one smart expert could have named it from the very beginning. 

In a similar way, it seems reasonable to assume that in the general fuzzy case, by combining 
the opinions of several experts, we do not seriously add any additional knowledge; we may diminish 
slightly an uncertainty domain for the unknown x, but that's all. 

How to describe this argument mathematically: we must apply normalization. In 
mathematical terms, we would like to postulate that the resulting membership function $\nu(x)$ coincides 
with one of the functions $\mu((x - a)/\delta)$, and in principle it could represent the opinion of just one 
smart expert. 

We cannot, however, postulate precisely that. The reason is as follows. The bigger $|y|$, the 
smaller is our belief that "most likely, $|y| \le 1$". So, the function $\mu(y)$ must be monotonically 
decreasing for $y > 0$. Its maximum m is attained when $y = 0$. So, when we combine the two 
statements "most likely, $|x| \le 1$" and "most likely, $|x - 0.3| \le 1$", the resulting membership 
function $\nu(x) = \mu(x)\mu(x - 0.3)$ is always smaller than $m^2$, because both factors are $\le m$, and for 
$x \ne 0$ the first factor is $< m$, and for $x = 0$ the second. So even if $m = 1$, the function $\nu(x)$ never 
attains m, and thus it cannot be equal to $\mu((x - a)/\delta)$. 

The solution to this problem is well known in fuzzy logic: we can normalize $\nu(x)$, i.e., turn
from $\nu(x)$ to $\nu'(x) = N\nu(x)$, where the normalization constant $N$ is equal to $N = 1/(\max_y \nu(y))$.
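As a numerical illustration of the two preceding paragraphs, the following minimal sketch combines the statements “most likely, $|x| < 1$” and “most likely, $|x - 0.3| < 1$” and then normalizes the product. The specific function `mu` (a Gaussian) and the grid search are assumptions made only for this example; any symmetric function that decreases for $y > 0$ would show the same effect.

```python
import math

def mu(y):
    # Assumed membership function for "most likely, |y| < 1" with mu(0) = 1.
    return math.exp(-y * y)

def nu(x):
    # "and" is represented as a product of the two experts' statements.
    return mu(x) * mu(x - 0.3)

# Coarse grid search for the maximum of nu on [-3, 3].
grid = [i / 1000.0 for i in range(-3000, 3001)]
peak = max(nu(x) for x in grid)

# Normalization constant N = 1 / max_y nu(y).
N = 1.0 / peak

def nu_normalized(x):
    return N * nu(x)
```

In this sketch the unnormalized product never reaches 1, while the normalized function attains 1 at its peak, which is the property the argument above requires.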

Comment. A motivation for using exactly this type of normalization is given in [KQLFLKBR92].

Now we are ready to formulate our demand. 

3. MATHEMATICAL FORMULATION OF THE PROBLEM 
AND THE MAIN RESULT 

Definition 1. By a membership function we will understand a continuous function $\mu(x)$ from the
set $R$ of all real numbers into the interval $[0,1]$.

Definition 2. We say that two membership functions $\mu(x)$ and $\nu(x)$ are equivalent if $\mu(x) = C\nu(x)$
for some constant $C > 0$.

Definition 3. We say that a membership function $\mu(x)$ is adequate for describing uncertainty of
measurements if it satisfies the following conditions:

• it is symmetric ($\mu(-x) = \mu(x)$),

• $\mu(x)$ is strictly decreasing on $(0, \infty)$ and tends to 0 as $x \to \infty$,

• for every finite sequence of pairs $(a_1, \delta_1), (a_2, \delta_2), \ldots, (a_k, \delta_k)$ there exist $a$ and $\delta$ such that the
product $\mu((x - a_1)/\delta_1)\,\mu((x - a_2)/\delta_2) \ldots \mu((x - a_k)/\delta_k)$ is equivalent to $\mu((x - a)/\delta)$.

THEOREM. Any membership function that is adequate for describing uncertainty of
measurements is equivalent to $\exp(-\beta x^2)$ for some $\beta > 0$.

(The proof is given in Section 5). 

Comment. So we conclude that Gaussian functions are the only adequate membership functions.
These functions are indeed widely used [K75], [BCDMMM85], [YIS85], [KM87, Ch. 5], etc. An
alternative explanation of why Gaussian functions are used is given in [KR86] and in Section 8 of
[KQLFLKBR92].




4. HOW THIS RESULT CAN BE USED AND HOW IT WAS USED 

How it can be used. If for some physical quantity $x$ several experts give their estimates $a_1, a_2, \ldots,
a_k$, and they estimate the precision of their estimates as correspondingly $\delta_1, \delta_2, \ldots, \delta_k$, then the
resulting membership function is equal to $\mu(x) = \exp(-\beta(x - a)^2/\delta^2)$, where
$\delta = (\delta_1^{-2} + \delta_2^{-2} + \ldots + \delta_k^{-2})^{-1/2}$ and $a = (a_1\delta_1^{-2} + \ldots + a_k\delta_k^{-2})/(\delta_1^{-2} + \ldots + \delta_k^{-2})$.

Comments.

1. These formulas can be easily obtained by explicitly computing $\mu(x)$ as a result of normalization
of the product $\mu_1(x)\mu_2(x) \ldots \mu_k(x)$, where $\mu_i(x) = \exp(-\beta(x - a_i)^2/\delta_i^2)$.

2. These formulas are identical to the statistical formulas that correspond to the
case when we have $k$ statistical estimates $a_i$ with precisions $\delta_i$ and apply the least squares
method $\sum_i (a - a_i)^2/\delta_i^2 \to \min$ to get the resulting estimate for $a$. This is not such a big
surprise, because the least squares method is based on the assumption of a Gaussian distribution.
The positive side is that not only are the resulting formulas extremely simple to implement, but
there may be no need to implement them at all, because we can reuse existing statistical
software.
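The combination formulas above can be sketched in a few lines. The function names `combine_estimates` and `gaussian` and the sample numbers are illustrative assumptions, not part of the original software; the code simply evaluates the expressions for $a$ and $\delta$ given in this section.

```python
import math

def combine_estimates(estimates, precisions):
    """Return (a, delta) for expert estimates a_i with precisions delta_i."""
    if len(estimates) != len(precisions) or not estimates:
        raise ValueError("need one precision per estimate")
    if any(d <= 0 for d in precisions):
        raise ValueError("precisions must be positive")
    weights = [d ** -2 for d in precisions]   # delta_i^(-2)
    total = sum(weights)
    a = sum(w * a_i for w, a_i in zip(weights, estimates)) / total
    delta = total ** -0.5
    return a, delta

def gaussian(x, a, delta, beta=1.0):
    """Membership function exp(-beta * (x - a)^2 / delta^2)."""
    return math.exp(-beta * (x - a) ** 2 / delta ** 2)

# Two hypothetical experts: 10.0 with precision 1.0 and 12.0 with precision 2.0.
a, delta = combine_estimates([10.0, 12.0], [1.0, 2.0])
```

With these sample inputs the more precise expert dominates: the combined estimate lies closer to 10 than to 12, and the combined $\delta$ is smaller than either individual precision.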

How this result was actually used. Expert estimates are extremely important in testing
jet engines. The reason is that an important part of this testing is trying to figure out what is
going on in the high-temperature regions, and the temperatures are so high there that we cannot
place any sensors. So the only available information about these regions consists of the experts’
estimates.

One of the authors (L.R.) used this fuzzy representation of uncertainty in designing software
for the automated jet engine testing system IVK-12 [KR86]. This system was actually used to
test jet engines for aircraft and spaceships.

Possible other applications. One area where we believe this approach can be useful is
determining the position of a Space Shuttle. The existing systems use several different types of
sensors, with different precisions, and often with only expert estimates of that precision. In order
to make appropriate control decisions we must combine these estimates into a single value. The fuzzy
approach allows us to do that.

5. PROOF OF THE THEOREM 

Comment. This proof contains some mathematical ideas from our previous publications [KR86] 
and [KQLFLKBR92]. 

1. Assume that $\mu(x)$ is an adequate function in the sense of the above definition. It is easy
to check that if $\mu(x)$ is an adequate choice, then the result $\mu(x)/(\max_y \mu(y))$ of its normalization is
also an adequate choice. Since $\mu(x)$ is monotone, this maximum is attained for $x = 0$, and therefore
the result of this normalization satisfies the condition $\mu(0) = 1$.

So, without losing any generality, we will further assume that $\mu(0) = 1$.

2. From the definition of an adequate function it follows, in particular, that $\mu(x)\mu(x) =
C\mu((x - a)/\delta)$ for some $a$, $C$ and $\delta$. The left-hand side attains its maximum ($= 1$) at $x = 0$, the
right-hand side attains its maximum (that is equal to $C$) for $x = a$. Since these two sides are one
and the same function, we conclude that $a = 0$ and $C = 1$, i.e., that $\mu^2(x) = \mu(k_2 x)$ for some
constant $k_2$ ($= 1/\delta$). For $l(x) = \log \mu(x)$ we conclude that $2l(x) = l(k_2 x)$.

Likewise, if we consider 3, 4, etc. terms, we conclude that $3l(x) = l(k_3 x)$, $4l(x) = l(k_4 x)$, etc.




3. The function $\mu(x)$ for $x > 0$ is monotonically decreasing from 1 to 0. Therefore, $l(x)$ is
monotonically decreasing from 0 to $-\infty$. Since $\mu$ is continuous, the function $l(x)$ is also continuous,
and, therefore, there exists an inverse function $i(X) = l^{-1}(X)$, i.e., such a function that $i(l(x)) = x$
for every $x$.

For this inverse function, the equality $n\,l(x) = l(k_n x)$ turns into $i(n\,l(x)) = i(l(k_n x)) = k_n x =
k_n i(l(x))$. So, if we denote $l(x)$ by $X$, we conclude that for every $n$, there exists a $k_n$ such that
$i(nX) = k_n i(X)$.

If we substitute $Y = nX$, we conclude that $i(Y) = k_n i(Y/n)$, and therefore, $i(Y/n) =
(1/k_n) i(Y)$.

From these two equalities, we conclude that $i((m/n)X) = (1/k_n) i(mX) = (k_m/k_n) i(X)$. So,
for every positive rational number $r$, there exists a real number $k(r)$ such that $i(rX) = k(r) i(X)$.

Therefore, for every positive rational $r$, the ratio $i(rX)/i(X)$ does not depend on $X$.

4. Since $i(X)$ is a continuous function, and any real number can be represented as a limit of
a sequence of rational numbers, we conclude that this ratio is constant for real values of $r$ as well.
Therefore, for every positive real number $r$ there exists a $k(r)$ such that $i(rX) = k(r) i(X)$.

All monotone solutions of this functional equation are known: they are $i(X) = AX^p$ for some
$A$ and $p$ [A66]. Therefore, the inverse function $l(x)$ ($x > 0$) also takes the similar form $l(x) = Bx^m$
for some $B$ and $m$. Taking into consideration that $\mu(x)$ and hence $l(x)$ are even functions, we
conclude that $l(x) = B|x|^m$ for all $x$.
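As a quick check of the direction that is easy to verify by hand, a power function does satisfy the functional equation of step 4. The absolute value below is our notational assumption, used because the argument $X = l(x)$ is non-positive; the converse (that there are no other monotone solutions) is the result cited from [A66].

```latex
i(X) = A\,|X|^{p}
\;\Longrightarrow\;
i(rX) = A\,|rX|^{p} = r^{p}\,A\,|X|^{p} = k(r)\,i(X),
\qquad k(r) = r^{p} \quad (r > 0).
```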

5. Now, from the demand that a function $\mu(x)$ is adequate, we conclude that for every $a > 0$ we
have $\mu(x - a)\mu(x + a) = C\mu((x - a_1)/\delta)$ for some $a_1$ and $\delta$. The left-hand side of this equation is an
even function, so the right-hand side must also be even, and therefore $a_1 = 0$. So, $\mu(x - a)\mu(x + a) =
C\mu(x/\delta)$. For $x = 0$ we get $\mu(a)\mu(a) = C$. Turning to logarithms, we conclude that for every $a$,
there exists a $k(a)$ such that $l(x - a) + l(x + a) = l(k(a)x) + 2l(a)$. If we substitute here $l(x) = B|x|^m$,
and divide both sides by $B$, we conclude that $|x - a|^m + |x + a|^m = k(a)^m |x|^m + 2a^m$.

6. When $x > 0$, and $a$ is sufficiently small, then $x + a$, $x$, and $x - a$ are all positive, and,
therefore, $(x - a)^m + (x + a)^m = k(a)^m x^m + 2a^m$. If we move $2a^m$ to the left-hand side, and divide
both sides by $x^m$, we conclude that $(1 - (a/x))^m + (1 + (a/x))^m - 2(a/x)^m = k(a)^m$. The left-hand
side of the resulting equality depends only on $z = a/x$, the right-hand side only on $a$. Therefore, if
we choose any positive real number $\lambda$, and take $a' = \lambda a$ and $x' = \lambda x$ instead of $a$ and $x$, then we
can conclude that the left-hand side will be still the same, and therefore, the right-hand side must
be the same, i.e., $k(a)^m = k(\lambda a)^m$. Since $\lambda$ was an arbitrary number, we conclude that $k(a)$ does
not depend on $a$ at all, i.e., $k(a)^m$ is a constant. Let us denote this constant by $k$.

So the equation takes the form $(1 - z)^m + (1 + z)^m = k + 2z^m$. When $z \to 0$, then the left-hand
side tends to 2 and the right-hand side to $k$, so from their equality we conclude that $k = 2$.

The left-hand side is an analytic function of $z$ for $z$ close to 0. Therefore the right-hand side
must also be a regular analytic function in the neighborhood of 0 (i.e., it must have a Taylor
expansion at $z = 0$). Hence, $m$ must be an integer.

The values $m < 2$ are impossible, because for $m = 0$ our equality turns into a false equality
$2 = 3$, and for $m = 1$ it turns into an equality $1 - z + 1 + z = 2 + 2z$, which is true only for $z = 0$.
So $m \ge 2$.




Since both sides are analytic in $z$, the second derivatives of both sides at $z = 0$ must be
equal to each other. The second derivative of the left-hand side at $z = 0$ is equal to $2m(m - 1)$. The
second derivative of the right-hand side is equal to $2m(m - 1)z^{m-2}$. If $m > 2$, then this derivative
equals 0 at $z = 0$ and therefore cannot be equal to $2m(m - 1)$. So $m \ge 2$, and $m$ cannot be greater
than 2. Therefore, $m = 2$.

So, $l(x) = Bx^2$, and hence $\mu(x) = \exp(-\beta x^2)$ for some $\beta > 0$. Q.E.D.
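For completeness, the converse direction can be checked directly: the Gaussian function does satisfy the equations used in steps 5 and 6. The following short verification is our own worked example and uses only the symbols defined in the proof.

```latex
\mu(x-a)\,\mu(x+a)
  = \exp\!\bigl(-\beta\,[(x-a)^2 + (x+a)^2]\bigr)
  = \exp(-2\beta a^2)\,\exp(-2\beta x^2)
  = C\,\mu\!\left(\frac{x}{\delta}\right),
\qquad C = \mu(a)^2, \quad \delta = \frac{1}{\sqrt{2}},

(1 - z)^2 + (1 + z)^2 = 2 + 2z^2 .
```

The second identity is the equation $(1 - z)^m + (1 + z)^m = k + 2z^m$ with $m = 2$ and $k = 2$, in agreement with the values obtained above.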

6. CONCLUSIONS 

How to represent in mathematical terms uncertain numeric statements about the value $x$ of
a physical quantity, e.g., statements of the type “most likely $x$ is between $a - \delta$ and $a + \delta$”?
Reasonable arguments lead us to the conclusion that the most adequate membership functions for
such statements are Gaussian functions $\mu(x) = \exp(-\beta(x - a)^2/\delta^2)$.

If we use these membership functions, then we can apply simple algorithms to combine the
opinions of several experts. Namely, if $k$ experts give estimates $a_1, \ldots, a_k$, and they estimate the
precision of their estimates as correspondingly $\delta_1, \delta_2, \ldots, \delta_k$, then the resulting membership function
is equal to $\mu(x) = \exp(-\beta(x - a)^2/\delta^2)$, where $\delta = (\delta_1^{-2} + \delta_2^{-2} + \ldots + \delta_k^{-2})^{-1/2}$ and
$a = (a_1\delta_1^{-2} + \ldots + a_k\delta_k^{-2})/(\delta_1^{-2} + \ldots + \delta_k^{-2})$.

These formulas coincide with the ones that result from applying the statistical least squares
method, so we do not even have to write new software.

This approach was applied to testing jet engines for aircraft and spaceships, and it may be
useful in many other applications, e.g., in combining the results of several coordinate and distance
sensors in spaceship navigation.

ACKNOWLEDGEMENTS 

This work was started with the support of the Soviet Space Shuttle Program and was continued
under NSF Grant No. CDA-9015006, NASA Research Grant No. 9-482, and the Institute for
Manufacturing and Materials Management grant. The authors are greatly thankful to Harold
Brown (GE Aircraft Engines) and Bob Lea (NASA Johnson Space Center) for inspiring discussions.

REFERENCES 

[A66] J. Aczel. Lectures on functional equations and their applications. Academic Press, N.Y.
and London, 1966.

[BCDMMM85] G. Bartolini, G. Casalino, F. Davoli, M. Mastretta, R. Minciardi, and E.
Morten. Development of performance adaptive fuzzy controllers with applications to continuous
casting plants, in: M. Sugeno (editor). Industrial applications of fuzzy control, North Holland,
Amsterdam, 1985, pp. 73-86.

[HC76] H. M. Hersch and A. Caramazza. A fuzzy-set approach to modifiers and vagueness in
natural languages. J. Exp. Psychol.: General, 1976, Vol. 105, pp. 254-276.

[K75] A. Kaufmann. Introduction to the theory of fuzzy subsets. Vol. 1. Fundamental theoretical
elements, Academic Press, N.Y., 1975.

[KQLFLKBR92] V. Kreinovich, C. Quintana, R. Lea, O. Fuentes, A. Lokshin, S. Kumar, I.
Boricheva, and L. Reznik. What non-linearity to choose? Mathematical foundations of fuzzy
control. Proceedings of the 1992 International Fuzzy Systems and Intelligent Control Conference,
Louisville, KY, 1992, pp. 349-412.




[KM87] R. Kruse and K. D. Meyer. Statistics with vague data. D. Reidel, Dordrecht, 1987. 

[O77] G. C. Oden. Integration of fuzzy logical information, Journal of Experimental Psychology:
Human Perception Perform., 1977, Vol. 3, No. 4, pp. 565-575.

[KR86] V. Kreinovich and L. K. Reznik. Methods and models of formalizing prior information
(on the example of processing measurements results). In: Analysis and formalization of computer
experiments. Proceedings of Mendeleev Metrology Institute, Leningrad, 1986, pp. 37-41 (in Russian).

[YIS85] O. Yagishita, O. Itoh, and M. Sugeno. Application of fuzzy reasoning to the water
purification process, in: M. Sugeno (editor). Industrial applications of fuzzy control, North Holland,
Amsterdam, 1985.

[Z65] L. Zadeh. Fuzzy sets. Information and control, 1965, Vol. 8, pp. 338-353. 

[Z78] H. J. Zimmermann. Results of empirical studies in fuzzy set theory. In: Applied general
systems research (G. J. Klir, ed.), Plenum, New York, 1978, pp. 303-312.





Paper Not Submitted 
in Time for Publication 





N93-29585 

Life Insurance Risk Assessment Using a 
Fuzzy Logic Expert System 


Luis A. Carreno and Roy A. Steel 
Togai Infralogic, Inc. 


Abstract 

In this paper, we present a knowledge based system that combines fuzzy 
processing with rule-based processing to form an improved decision aid for evaluating 
risk for life insurance. 

This application illustrates the use of FuzzyCLIPS to build a knowledge based
decision support system possessing fuzzy components to improve user interactions and
KBS performance. The results employing FuzzyCLIPS are compared with the results
obtained from the solution of the problem using traditional numerical equations. The
design of the fuzzy solution consists of a CLIPS rule-based system for some factors
combined with fuzzy logic rules for others. This paper describes the problem, proposes a
solution, presents the results, and provides a sample output of the software product.

1.0 Introduction to FuzzyCLIPS

FuzzyCLIPS adds fuzzy processing capability to CLIPS 5.1. The architecture is a
separate processing element similar to that used to incorporate object-oriented
programming into CLIPS. The basic fuzzy constructs and function calls can be written
intermixed with usual CLIPS statements. Principal fuzzy constructs define rule bases and
membership functions. A fuzzy membership function can be associated with a universe
of discourse. This improvement allows readable terms such as "high" and "low" to be
used in different contexts. There are also functions by which a CLIPS program can test
the degree of membership of a sensor value, execute a fuzzy rule base that returns
defuzzified control values to CLIPS and, optionally, assert facts giving belief values for
the possibilities that might be useful in an expert system. In addition, C interface
functions support embedded fuzzy applications that can invoke the fuzzy processor
directly for speed in embedded control applications. FuzzyCLIPS is designed to be
compatible with future CLIPS versions. Like CLIPS, it can operate as a stand alone
program or be embedded in a larger application.

2.0 Problem Statement 

An insurance company needs to assess the degree of health risk associated with
each client based on physical characteristics such as height, weight, and age, and on exercise,
smoking, drinking, and eating habits. The output risk value serves as the basis for the
determination of insurance premiums billed to clients. Those premiums have a base rate
(perfect health, good habits, 35 years old) and an increment to adjust the premium based
on the risk. A system that produces a risk value between 0.0 and 1.0 suffices to set a net
rate. The equation is

Cost to Insure Client = Base Rate + ((Risk / Base Risk) - 1) * Increment          (1)
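Eq. (1) can be written as a one-line function. The following is a minimal sketch; the function name `cost_to_insure` and the sample base rate and increment are hypothetical values chosen for illustration, since the paper does not state the figures it used.

```python
def cost_to_insure(base_rate, increment, risk, base_risk):
    """Eq. (1): Base Rate + ((Risk / Base Risk) - 1) * Increment."""
    if base_risk <= 0:
        raise ValueError("base risk must be positive")
    return base_rate + (risk / base_risk - 1.0) * increment

# Hypothetical figures: a client whose risk equals the base risk pays the base rate.
same_as_base = cost_to_insure(1000.0, 500.0, risk=0.344, base_risk=0.344)
# A client with twice the base risk pays the base rate plus one full increment.
double_risk = cost_to_insure(1000.0, 500.0, risk=0.688, base_risk=0.344)
```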




The relation between decision factors and the rate change need be neither incremental nor
linear, i.e., separate consideration of the decision factors may not determine a change in
rate that can be simply summed to determine the net rate. This means that the questioning
of the client must be controlled; it makes no sense to continue to ask a client about all
factors if a decision on rates is possible at some intermediate point in the interaction.
Complex nonlinearity and interdependence of the factors mean that computer-based
decision aids are useful to a human agent and that sharp decision boundaries such as
those produced by a normal rule based system are sensitive to small uncertainties in the
input data. Fuzzy logic provides a basis for accommodating such uncertainty with
finesse.

The input variables of the system are of two different types: base and incremental.
The base type of input variables are Age (A), Weight (W), and Height (H). A derived
internal variable is the body mass index (BMI) that estimates fitness or body fat content.
Incremental input variables deal with particular habits and characteristics of prospective
clients. The following are considered such variables in the present example: exercising
(E), dairy products intake (DI), red meat intake (MI), vegetable intake (VI), fat/sweet
intake (FSI), smoking (S), and drinking (D). The output of the system is the degree of
risk (R).


3.0 Traditional Numerical Solution 

For the traditional method solution, we treat each of the variables as a numeric input
or a selection from a finite, discrete, closed set of possibilities. Each variable is
represented as a lookup table of intervals where the value of the corresponding risk
contribution is specified for each interval. The following table presents the values of the
contribution to risk due to Age.

age          age-risk
0 to 30      0.25
31 to 60     0.5
61 to 90     0.75
>90          1.0

We note that this table could be used in a rule-based knowledge system (KBS) to provide
rules of the form

(defrule age-risk-0-to-30            ; rule names are illustrative
   (age ?age&:(<= ?age 30))
   => (assert (age-risk 0.25)))

(defrule age-risk-31-to-60           ; both bounds are tested on the same age fact
   (age ?age&:(> ?age 30)&:(<= ?age 60))
   => (assert (age-risk 0.5)))

etc.


For discrete selections, the table contains the risk value assigned to each value. An
example of a corresponding rule is

(defrule smoke-factor-example        ; rule name is illustrative
   (smoke-habit ?input&:(or (eq ?input 0) (eq ?input S) ...))
   => (assert (smoke-factor 0.25)))

When each factor has been evaluated, the total risk is evaluated as a weighted
combination of the risks due to various factors, where the values of the weights provide
another knowledge component of the decision support system.

3.1 Body Mass Index 

The inputs, Height and Weight, are used to obtain the body mass index (BMI).
This measure determines if a person is overweight or not. BMI is calculated by dividing
the Weight in kilograms by the square of the height in meters, BMI = Weight/(Height)^2.

The following table shows the scale used to measure BMI and the corresponding
BMI-risk that is used later to calculate risk.

BMI         Condition      BMI-risk
under 23    Underweight    0.25
23-25       Ideal          0.0
25-30       Overweight     0.75
over 30     Obese          1.0
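The BMI calculation and the two lookup tables (age-risk and BMI-risk) can be sketched as follows. The function names are illustrative, and the treatment of the shared boundaries is an assumption: the table lists both "23-25" and "25-30", so this sketch assigns a BMI of exactly 25 to the Overweight interval and a BMI of exactly 30 to Overweight rather than Obese.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms over squared height in meters."""
    return weight_kg / height_m ** 2

def bmi_risk(value):
    """BMI-risk lookup; boundary handling at 25 and 30 is an assumption."""
    if value < 23:
        return 0.25   # Underweight
    if value < 25:
        return 0.0    # Ideal
    if value <= 30:
        return 0.75   # Overweight
    return 1.0        # Obese

def age_risk(age):
    """Age-risk lookup from the table in Section 3.0."""
    if age <= 30:
        return 0.25
    if age <= 60:
        return 0.5
    if age <= 90:
        return 0.75
    return 1.0

# Example: 70 kg at 1.75 m gives a BMI just under 23.
example_bmi = bmi(70.0, 1.75)
```

The abrupt steps in these lookups (for instance between ages 30 and 31) are exactly the sharp decision boundaries that the fuzzy solution later smooths out.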


3.2 Mathematical Model for Traditional System 

In a traditional system, the first step in the solution of the problem is to define a
mathematical relation between the inputs and outputs of the system. The objective is to
obtain a numerical value that represents the risk of a person having medical problems due
to his physical characteristics and eating habits. Risk is defined as having a range of
[-0.357,1]. The various factors are also assumed to have values in the [0,1] range by
mappings similar to those presented above for age and BMI. A risk measure of 1
represents the maximum degree of risk, whereas a measure of 0 or less represents
the minimum degree of risk.

In general,

Risk = w_BMI*(BMI-risk) + w_S*(Smoking-risk) + w_D*(Drinking-risk) +
       w_E*(Exercise-risk) + w_VI*(Vegetarian-risk) + w_DI*(Dairy-Products-Intake-risk) +
       w_MI*(Red-Meat-Intake-risk) + w_FSI*(Fat/Sweet-Intake-risk) +
       w_A*(Age-risk)                                                        (2)

Constants w_E and w_VI are negative because they reduce total risk. The other weights are
expected to be positive. Values of the weights are based on the corresponding factor’s
effect on the overall degree of health of a person.
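Eq. (2) is a plain weighted sum, which the following minimal sketch illustrates. The weight values below are hypothetical and were picked only to show the sign convention (negative for exercise and vegetable intake); the paper does not list the weights it used, and these values do not reproduce its stated risk range.

```python
# Hypothetical weights; only the signs follow the text (w_E and w_VI negative).
WEIGHTS = {
    "BMI": 0.20, "S": 0.20, "D": 0.10, "E": -0.10, "VI": -0.05,
    "DI": 0.05, "MI": 0.05, "FSI": 0.10, "A": 0.20,
}

def total_risk(factor_risks, weights=WEIGHTS):
    """Eq. (2): sum of weight * factor risk over all nine factors."""
    if set(factor_risks) != set(weights):
        raise ValueError("factor names must match the weight names")
    return sum(weights[name] * factor_risks[name] for name in weights)

# Example: every factor at its maximum contribution of 1.0.
all_high = total_risk({name: 1.0 for name in WEIGHTS})
# Example: only the two risk-reducing factors are active.
only_protective = total_risk(
    {name: (1.0 if name in ("E", "VI") else 0.0) for name in WEIGHTS}
)
```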

3.3 Effects of Habits (Incremental Inputs) 

In addition to age and BMI, factors reflecting a person's habits contribute to risk
assessment. These are generally harder to quantify and are often described by qualitative
terms such as "I smoke a little" or "I eat lots of vegetables." There are two approaches
that are used to handle such data. The normal one is to attempt to quantify the habit in
terms of frequency of participation and quantity of material, time, or activity concerned,
much as the scientist who studies effects of various habits on health risk quantifies inputs
to the evaluation experiments. The other approach is to classify estimates of activity
frequency and level into literal categories from options offered to the respondent. For example,
exercise might be analyzed from a more complicated user interface:

Level                  Frequency          Type
_ aerobic              _ very frequent    _ walking or treadmill
_ strength building    _ frequent         _ jogging
_ other                _ sometimes        _ lift weights
_ active work          _ occasionally     _ exercise machine
_ do not exercise      _ never            _ water sports
                                          _ team sports
                                          _ skating
                                          _ skiing


The disadvantage of such an approach is that a need for understanding the respondent's
meaning for a term means ambiguity in the input data and stress for the respondent in
deciding which category fits his case. In general, more complex interfaces are required to
provide sufficient detail or correlations from which to extract information about whether
the user understands or is trying to bias answers in his favor.

A user interface in which the user chooses values for frequency and intensity
against an arbitrary scale (e.g., "on a scale of 1 to 10, how much do you drink?")
introduces the potential to fuzzify the input to conduct reasoning with correlation and
interpolation between benchmarks or way points.

Qualitative values indicating the change in risk due to various habits are shown
below.



Health Risk         Risk Increases    Neutral    Risk Decreases
Smoking             High              Med        None
Drinking            High              Med        None
Exercising          Low               Med        High
Vegetable Intake    Low               Med        High
Red Meat Intake     High              Med        Low
Dairy Intake        High              Med        Low
Fat/Sweet Intake    High              Med        Low


4.0 Fuzzy Logic Solution 





In a fuzzy logic based system, an expert defines the rules. Such rules are used to
describe the characteristics of the risk assessment for each factor. Later on, the input
variables are matched against the set of rules to produce the appropriate output. Each one
of the fuzzy variables contributes to the output of the system depending on how many
rules are fired for each particular input variable. Fig. 1 depicts a schematic fuzzy decision
support system. For fuzzy reasoning we use a max-dot inferencing technique, and
centroid defuzzification technique.
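To make the last sentence concrete, the following is a minimal sketch of max-dot inference followed by centroid defuzzification. It is not the FuzzyCLIPS implementation. The triangular shapes in `RISK_SETS`, the use of the minimum for "and" between antecedents, and the sampling-based centroid are all assumptions made for illustration; the actual output sets are those of Fig. 2.

```python
def triangle(x, left, peak, right):
    """Triangular membership function (an assumed shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Hypothetical output sets on the risk universe [0, 1].
RISK_SETS = {
    "Low": (0.0, 0.2, 0.4),
    "Medium": (0.2, 0.4, 0.6),
    "High": (0.4, 0.6, 0.8),
    "Very High": (0.6, 0.8, 1.0),
}

def firing_strength(*degrees):
    # Assumption: "and" between antecedents is taken as the minimum.
    return min(degrees)

def max_dot_centroid(fired_rules, samples=1001):
    """fired_rules: list of (strength, output set name) pairs."""
    numerator = denominator = 0.0
    for i in range(samples):
        x = i / (samples - 1)
        # max-dot: scale each consequent by its rule strength (dot),
        # then aggregate the scaled sets with a pointwise maximum (max).
        aggregated = max(
            (s * triangle(x, *RISK_SETS[name]) for s, name in fired_rules),
            default=0.0,
        )
        numerator += x * aggregated
        denominator += aggregated
    if denominator == 0.0:
        raise ValueError("no rule fired")
    return numerator / denominator   # centroid of the aggregated set

# Two fired rules: a weak "Low" and a strong "High".
crisp_risk = max_dot_centroid([(0.2, "Low"), (0.8, "High")])
```

One consequence of scaling rather than clipping is that a single fired rule always defuzzifies to the centroid of its output set, whatever its strength; the strengths matter only for how several fired rules are balanced against each other.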




For this particular example, four different sets of fuzzy rules are defined. The first
rulebase relates a risk_1 to age and BMI. The second rulebase relates a risk_2 to smoking
and drinking habits. The third rulebase relates a risk_3 to the amount of exercise and
intake of vegetables. The last rulebase relates a risk_4 to intake of dairy products, red
meat, and fat and sweet products. A fifth rulebase relates risks 1-4 to the overall risk to
complete the risk assessment. The importance of breaking down the problem into smaller
related groups is the fact that the number of rules needed to control the system decreases
dramatically. In our example, the number went down from 4 * 3^7 (8748) rules to a
maximum of 313 rules.
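The two rule counts quoted above can be reproduced with simple arithmetic. The breakdown below is our reading, not a statement from the paper: it assumes three fuzzy sets per habit input, three for Age, four for BMI, and four for each partial risk, and that each rulebase has at most one rule per combination of its input sets.

```python
# Figure quoted in the text for a single rulebase over all inputs.
single_rulebase = 4 * 3 ** 7

# Assumed maximum size of each of the five smaller rulebases.
grouped = [
    3 * 4,    # rulebase 1: Age (3 sets) x BMI (4 sets)
    3 * 3,    # rulebase 2: smoking x drinking
    3 * 3,    # rulebase 3: exercise x vegetable intake
    3 ** 3,   # rulebase 4: dairy x red meat x fat/sweet intake
    4 ** 4,   # rulebase 5: four partial risks, 4 sets each
]
total_grouped = sum(grouped)
```

Under these assumptions the grouped total matches the maximum stated in the text, with most of the rules sitting in the fifth rulebase that merges the four partial risks.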

After calculating the BMI and having obtained the age from the user interface, an
initial measure of risk, risk_1, is obtained. This measure serves as the basis for
subsequent decisions. If the risk obtained is considered by the system as very high, no
further inquiries of the user are necessary. On the other hand, if the risk obtained is
considered low, medium, or high, further inquiries into the client’s habits are necessary to
produce a more meaningful result.

The output of the system consists of a crisp value for Risk in the range [0,1]. The 
system also produces a truth value associated with each output fuzzy set, i.e., the degree 
to which each fuzzy set defining risk contributes to the output value of risk. 



Fig. 1 A schematic view of the fuzzy logic risk assessor










4.1 Membership Functions

In order to solve the problem using fuzzy logic methods, we defined sets of
membership functions associated with each variable:

A         Lo, Med, Hi
BMI       Under, Ideal, Over, Obese
Risk_n    Low, Medium, High, Very High

The universe of discourse for each of the above fuzzy variables is [0,1] for each risk
(Fig. 2), [0,40] for BMI (Fig. 3), and [0,100] for Age (Fig. 4).


Fig. 2 Risk Membership Functions (sets: Low, Med, Hi, Very Hi)

Fig. 3 BMI Membership Functions (sets: Under, Ideal, Over, Obese)

Fig. 4 Age Membership Functions (sets: Lo, Med, Hi; horizontal axis: Age (A), 0 to 100)

4.2 Rules 

A sample of the fuzzy logic rule set for Risk, based on all the inputs as a whole,
can be seen in the following table.


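(Column headings are illegible in the source; the first two columns appear to be A and BMI, the
next seven are the habit inputs, and the last is Risk.)

A      BMI      Habit inputs (seven columns)          Risk
Lo     Ideal    Med   Hi    Med   Lo    Lo    Lo    Lo     Low
Lo     Ideal    Hi    Med   Lo    Lo    Lo    Lo    Lo     Low
Lo     Ideal    Lo    Hi    Lo    Lo    Med   Lo    Lo     Low
Med    Ideal    Lo    Lo    Hi    Hi    Hi    Lo    Lo     Medium
Med    Over     Med   Med   Lo    Lo    Med   Med   Lo     Medium
Lo     Over     Lo    Lo    Med   Hi    Hi    Lo    Med    Medium
Med    Obese    Lo    Hi    Lo    Lo    Hi    Hi    Lo     High
Med    Over     Lo    Lo    Med   Hi    Hi    Hi    Med    High
Hi     Over     Lo    Lo    Hi    Med   Med   Med   Med    High
Med    Obese    Lo    Lo    Lo    Hi    Hi    Med   Hi     Very High
Hi     Obese    Lo    Lo    Hi    Hi    Hi    Hi    Med    Very High
Hi     Over     Lo    Lo    Hi    Hi    Hi    Hi    Hi     Very High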


As explained earlier, a rulebase with that many inputs is difficult to implement
due to the large number of possible combinations of the input variables. Examples of
fuzzy rules, using the alternative approach of breaking down the input variables into
smaller and related groups, are shown next.


IF    A is Hi and
      BMI is Obese
THEN  Risk_1 is Very High

IF    S is Hi and
      D is L
THEN  Risk_2 is High

IF    E is Hi and
      VI is M
THEN  Risk_3 is Low

IF    MI is L and
      DI is M and
      FSI is M
THEN  Risk_4 is Medium


In the application, five rulebases are defined. As explained earlier, each one produces a
partial risk that is merged at the end of processing to produce a final assessment of risk.
Such risk is compared with an ideal risk called base risk. The base risk is the risk
associated with a 35 year old with the following physical characteristics and
drinking/eating habits: ideal BMI, non smoker, low consumption of alcoholic drinks, low
consumption of dairy products, red meat products, and fat/sweet products, high
consumption of vegetables, and high amounts of exercise. The total risk of a particular
person is calculated and substituted in Eq. (1) to produce a premium amount.


4.3 User Interface 

There are two special cases in the processing of the problem. First, if the initial
risk, based on age and BMI, is greater than 0.8, the risk is considered very high.
Therefore, no further processing by the system is needed. Second, if the initial assessment
of BMI is greater than 30, meaning the person is obese, questions related to the habits of
consumption of dairy products, red meat, and fat/sweet products are omitted. Otherwise,
the user interface is the same as that for the numerical method.
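The two special cases can be sketched as a small function that decides which habit questions remain to be asked. The function name, the question labels, and the list-based interface are illustrative assumptions and do not describe the actual FuzzyCLIPS program; only the two thresholds (0.8 for the initial risk and 30 for BMI) come from the text.

```python
def questions_to_ask(risk_1, bmi):
    """Return the habit questions still needed after the initial assessment."""
    if risk_1 > 0.8:
        # Initial risk is very high: no further questioning is needed.
        return []
    questions = ["smoking", "drinking", "exercise", "vegetable intake"]
    if bmi <= 30:
        # Intake questions are omitted only for an obese client (BMI > 30).
        questions += ["dairy intake", "red meat intake", "fat/sweet intake"]
    return questions
```

For example, a client with an initial risk of 0.5 and a BMI of 32 would be asked four questions under this sketch, while a client with the same risk and a BMI of 24 would be asked all seven.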

5.0 Results and Conclusions 

To compare the methods, sample data was created and processed by both versions
of the program. The sample data consists of a group of persons with the same eating and
exercise habits; the only variable is the age of the individuals. The constant characteristics
can be seen in the following table.

BMI      S     D    VI    FSI    E    DI    MI
Ideal    no    L    H     L      H    L     L

The values of age used were in the range [20, 100]. The results were as expected. For the
traditional method, we can see abrupt changes in the value of risk associated with ages at
the edges of the intervals. As observed in figure 5, the value of risk jumps at age 30
and then remains constant until age 60, where it jumps again. The
process is repeated at age 90.

For the fuzzy logic solution, as observed in figure 5, no sharp differences are
produced at any specific age, i.e., the values of risk increase smoothly along the whole
universe of discourse. The fuzzy system produces more realistic values for different ages,
especially for those cases in which the age varies from 30 to 31, 60 to 61, or 90 to 91.



Fig. 5 Risk for Traditional Vs Fuzzy Logic Method (horizontal axes: Age)




6.0 Sample Output 

The application program described in this document was written using the alpha
version of FuzzyCLIPS. It generates an interactive session, in which the user is
questioned in order to gather information about a client's physical characteristics,
exercise habits, and eating and drinking habits.

After receiving all of the information needed, the partial values of risk are determined,
and a final summary report is produced. It consists of the four partial risks and their values,
the total value of risk, the value of the base risk, explained earlier, the ratio of the total to
base risk, the annual insurance premium, and the individual contributions of each
membership function by risk and its predicate values.

*************************************************
                     SUMMARY
*************************************************
Risk based on
   age and bmi                 ===> 0.318
   smoking/drinking            ===> 0.600
   exercise/vegetable intake   ===> 0.400
   fat intake                  ===> 0.842
*************************************************
total risk => 0.547
BASE RISK  => 0.344
*************************************************
RATIO total/base risk => 1.59
*************************************************
YOUR ANNUAL PREMIUM IS => $ 1941.13
*************************************************
INDIVIDUAL MBF CONTRIBUTIONS BY RISK
*************************************************
Fat intake Risk     ==> MBF VH    Degree of Truth 1.0
Exer/Veggies Risk   ==> MBF H     Degree of Truth 4.2e-005
Exer/Veggies Risk   ==> MBF M     Degree of Truth 0.99
Smoke/Drink Risk    ==> MBF H     Degree of Truth 0.99
Smoke/Drink Risk    ==> MBF M     Degree of Truth 4.2e-005
Age/BMI Risk        ==> MBF M     Degree of Truth 0.587
Age/BMI Risk        ==> MBF L     Degree of Truth 0.412
*************************************************


7.0 References 

1. FuzzyCLIPS Reference Manual, Vol. 1, Basic Programming Guide, Alpha Release,
Oct. 19, 1992.

2. CLIPS Reference Manual, Vol. 1, Basic Programming Guide, JSC-25012, NASA/JSC,
Sept. 10, 1991.

3. The New Good Housekeeping Family Health and Medical Guide, section three,
pp. 608-615 (Hearst Corporation, New York, 1989).







AUTHOR INDEX 


Aazhang, Behnaam 589
Barone, Joseph M 205
Bezdek, James C 98
Bogdan, I 350
Bosc, Patrick 478
Buhusi, Catalin V 360
Carreno, L A 627
Chang, Ching-Chuang 418
Chen-Kuo Tsao, Eric 98
Chiang, Jung-Hsien 257
Chiu, Stephen 239
Chung-Hoon Rhee, Frank 81
Cohen, M. E 535
Copeland, Charles 154
Cutello, Vincenzo 215
Dasarathy, Belur V 368
Daugherty, Walter C. 143
Dekorvin, Andre 398
Deutsch-McLeish, Mary 266
Dockery, John T. 248
Filev, Dimitar P 135
Fleischman, Robert M 496
Fleming, J. W 524
Flikop, Ziny 378
Frigui, Hichem 59
Gader, Paul 257
Gantner, T. E 427
Gebhardt, Jorg 296
Gorzalczany, Marian B. 266
Grantner, Janos 312
Gutierrez-Martinez, Salvador 437
Henson, Troy F 589
Holden, A. V 545
Hu, Yong Lin 488
Huang, Song 108
Huang, X. H 524
Hudson, D. L 535
Huxhold, Wendy L 589
Janabi, Talb H 340, 449
Joslyn, Cliff 458
Juang, C. H 524
Karpovsky, Efin Ja 224
Karr, C. Lucas 506
Karwowski, Waldemar 1
Keller, James M 49
Keller, Jim 468
Klir, George 598
Koczy, Laszlo T. 608
Kraslawski, A 408
Kreinovich, V. 418, 618
Krishnapuram, Raghu 49, 59, 81
Krovvidy, Srinivas 488
Kruse, Rudolf 296, 388
Labos, A. S 545
Labos, E 545
Laczko, J 545
Lara-Rosano, Felipe 304
Lea, Robert N. 154, 398, 515
Lee, Hon-Mun 125
Lee, Jonathan 185
Leigh, Albert B 69
Lim, P. Y. W 427
Littman, David 248
Ma, Yibing 49
Madyastha, Raghavendra K 589
Mansfield, W. H 496
McAllister, Luisa 555
Mitra, Sunanda 166
Mohamed, Magdi 257
Montero, Javier 215
Nair, Satish S 118
Nasraoui, Olfa 59
Natarajan, Swami 185
Nauck, Detlef 296, 388
Niskanen, Vesa A 625
Nolan, Adam 488
Nystrom, L 408
Ostaszewski, Krzysztof 1
Pagni, Andrea 195
Pal, Nikhil R 98
Pal, Sankar K 69
Parlos, Alexander 125
Parviz, Behzad 598
Patyra, Marek J 312, 565
Pavel, Sandy 575
Pedrycz, W. 322, 581
Pemmaraju, Surya 166
Pfluger, Nathan 185
Pivert, Olivier 478
Poluzzi, R 195
Prade, Henri 7
Quintana, C 618
Ramamoorthy, P. A 108
Ramer, Arthur 14
Reznik, L 418, 618
Rizzotto, G. G 195
Roventa, Eugene 581
Rueda, A 322
Ryjov, Alexander P 21
Schott, Brian 330
Seniw, David 175
Shehadeh, Hana 515
Shieh, C-Y 118
Shipley, Margaret F 398
Solopchenko, G. N 418
Stachowicz, Marian S 312
Steel, R. A 627
Steinlage, Ralph C 427




Sultan, L H. 340, 449
Sztandera, Leszek M. 89
Tahani, Hossein 468
Tavakoli, Nassrin 175
Teodorescu, H. N 350
Tian, Y 29
Tsai, Wei K 125
Turksen, I. B 29, 276, 286
Vaidya, Nitin 39
Villarreal, James A. 154
Wang, Haojin 143
Wee, William G 488
Whalen, Thomas 229, 330
Willson, Ian A. 286
Yager, Ronald R 7
Yashvant, Jani 154
Yen, John 39, 143, 185







REPORT DOCUMENTATION PAGE 


Form Approved 
OMB No. 0704-0188


Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and
maintaining the data needed, and completing and reviewing the collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information,
including suggestions for reducing this burden, to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA
22202-4302, and to the Office of Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503.


1. AGENCY USE ONLY (Leave blank)


2. REPORT DATE 

December 1992 


3. REPORT TYPE AND DATES COVERED

Conference Publication 


4. TITLE AND SUBTITLE

NAFIPS '92 North American Fuzzy Information Processing 
Society Volume 2 


6. AUTHOR(S)

James Villarreal, compiler (NASA) 


7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

Lyndon B. Johnson Space Center
Houston, Texas 77058 



8. PERFORMING ORGANIZATION 
REPORT NUMBER 

S-702 


9. SPONSORING / MONITORING AGENCY NAME(S) AND ADDRESS(ES)

National Aeronautics and Space Administration 
Washington, D.C. 20546 


10. SPONSORING /MONITORING 
AGENCY REPORT NUMBER 

CP 10112 



12a. DISTRIBUTION / AVAILABILITY STATEMENT 

Unlimited/Unclassified
Subject Category 59 


12b. DISTRIBUTION CODE 


This document contains papers presented at the NAFIPS '92 North American Fuzzy
Information Processing Society Conference, held at the Melia Hotel, Paseo de la Marina
Sur, Marina Vallarta, in Puerto Vallarta, Mexico, on December 15-17, 1992. More than 75
papers were presented at this conference, which was sponsored by NAFIPS in cooperation
with NASA, the Instituto Tecnologico de Morelia, the Indian Society for Fuzzy
Mathematics and Information Processing (ISFUMIP), the Instituto Tecnologico de Estudios
Superiores de Monterrey (ITESM), the International Fuzzy Systems Association (IFSA),
the Japan Society for Fuzzy Theory and Systems, and the Microelectronics and Computer
Technology Corporation (MCC).

Fuzzy set theory has led to a large number of diverse applications. Recently,
interesting applications have been developed which involve the integration of fuzzy
systems with adaptive processes such as neural networks and genetic algorithms. NAFIPS
'92 was directed toward the advancement, commercialization, and engineering development
of these technologies.


Fuzzy Systems, Neural Networks, Genetic Algorithms, Optimization, 
Pattern Recognition, Path Planning, Robotics, Information 
Processing and Vision, Decision Analysis, Control Systems 


17. SECURITY CLASSIFICATION 
OF REPORT 

Unclassified 


18. SECURITY CLASSIFICATION 
OF THIS PAGE 

Unclassified 


19. SECURITY CLASSIFICATION 
OF ABSTRACT 

Unclassified 


15. NUMBER OF PAGES


16. PRICE CODE 


20. LIMITATION OF ABSTRACT 

Unlimited 


















